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Abstract 

The  United  States  Air  Force  has  a  pressing  need  for  new  methods  of  hyperspec- 
tral  imaging.  All  current  hyperspectral  imaging  technologies  require  long  exposure 
times,  since  each  involves  filtering  the  available  light,  either  spatially  or  according 
to  color.  We  consider  a  recently  proposed  method  for  hypserspectral  imaging  that 
promises  shorter  exposure  times.  This  new  method  applies  the  mathematical  prin¬ 
ciples  of  tomography  to  the  hyperspectral  data  cube.  Known  as  chromotomography, 
this  method  uses  a  spinning  prism  to  essentially  capture  the  integrals  of  this  cube 
over  many  rotations  of  a  single  line.  This  thesis  addresses  some  of  the  mathematical 
issues  that  arise  when  trying  to  reconstruct  a  hyperspectral  image  from  chromoto- 
mographic  measurements.  After  reviewing  some  of  the  mathematical  shortcomings 
of  the  current  state  of  the  art — which  arise  from  the  technical  difficulties  of  working 
with  the  continuous- variable  X-ray  transform — we  make  three  contributions.  First, 
we  introduce  a  mathematically  rigorous,  discrete,  X-ray  transform  that  is  somewhat 
faithful  to  its  continuous  cousin.  Second,  we  show  how  under  a  few  simplifying 
assumptions,  our  discrete  transform  can  be  generalized  so  as  to  provide  a  good  ap¬ 
proximation  of  the  continuous  one.  This  discretization  allows  us  to  apply  modern 
finite-dimensional  optimization  methods  to  the  chromotomographic  reconstruction 
problem.  Our  third  contribution  is  to  apply  a  popular  new  example  of  such  a  method, 
known  as  Split  Bregman  iteration. 
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A  DISCRETE  X-RAY  TRANSFORM  FOR  CHROMOTOMOGRAPHIC 

HYPERSPECTRAL  IMAGING 

I.  Introduction 

The  United  States  Air  Force  has  a  pressing  need  for  new  sensor  technologies 
that  give  us  the  capability  to  analyze  the  chemical  nature  of  spontaneous  and  brief 
real  world  events  [13, 17].  A  popular  modern  method  for  remotely  sensing  chem¬ 
ical  information  is  hyperspectral  imaging ,  that  is,  taking  photographs  with  many 
color  channels,  unlike  traditional  photographs  that  either  have  one  color  channel 
(grayscale)  or  three  color  channels  (red-green-blue).  Unfortunately,  all  current  hy¬ 
perspectral  imaging  technologies  require  long  exposure  times,  since  each  involves 
filtering  the  available  light,  either  spatially  or  according  to  color.  That  is,  even 
state-of-the-art  hyperspectral  sensors  are  incapable  of  imaging  transient  events  [3]. 
For  this  reason,  in  the  past  decade,  researchers  have  been  proposing  new  hyperspec¬ 
tral  imaging  systems  that  make  better  use  of  the  available  light,  thereby  allowing 
shorter  exposure  times  [1,3, 6-8, 10, 13, 18]. 

Some  of  these  new  methods  of  hyperspectral  imaging  are  based  on  principles 
of  tomography.  Traditional  tomography  passes  waves  of  energy  through  solid  objects 
(such  as  human  bodies)  to  construct  three-dimensional  images  (data  cubes)  repre¬ 
senting  the  objects’  composition.  Some  recently  proposed  methods  of  hyperspectral 
imaging  involve  tomography  with  color.  To  be  precise  chromotomography  (CT) — 
not  to  be  confused  with  chromatography — is  a  type  of  hyperspectral  imaging  that 
uses  the  mathematical  principles  of  tomography  in  order  to  capture  a  hyperspectral 
image  via  the  use  of  a  spinning  prism  [6].  This  thesis  addresses  some  of  the  mathe¬ 
matical  issues  that  arise  when  trying  to  reconstruct  a  hyperspectral  image  from  the 
data  produced  by  a  CT  camera  such  as  the  chromotomographic  experiment  (CTEx) 
developed  at  the  Air  Force  Institute  of  Technology  (AFIT)  [17]. 
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In  particular,  we  make  three  major  contributions:  (1)  a  mathematically  rig¬ 
orous,  discrete  model  of  a  CT  camera  system  that  is  somewhat  realistic;  (2)  an 
extension  of  this  theory  which  shows  how  to  slightly  generalize  our  discrete  trans¬ 
form  so  as  to  make  it  approximate  our  continuous  one  very  well;  and  (3)  the  first 
application  of  a  popular  new  numerical  optimization  technique — the  Split  Bregman 
method — to  our  reconstruction  problem. 

1.1  Mathematical  preliminaries 

We  now  briefly  outline  our  approach.  We  consider  a  linear  model,  that  is,  we 
treat  the  camera  system  as  a  linear  operator  L.  This  enables  us  to  examine  the 
reconstruction  problem  using  linear  least  squares.  In  order  to  do  this  we  need  an 
expression  for  the  operator  itself,  as  well  as  one  for  its  adjoint  (transpose)  L*.  We 
rediscover  the  previously  known  fact  that  their  composition,  L*L  is  a  filter,  meaning 
that  it  is  diagonalized  by  the  Fourier  transform.  Using  this  fact,  we  find  that  we 
are  dealing  with  an  operator  with  a  gigantic  null  space;  in  the  literature,  this  space 
is  known  as  the  cone  of  missing  information.  This  means  that  we  are  not  able  to 
directly  reconstruct  the  original  image  from  the  CT  camera  measurements.  That  is, 
there  is  information  that  we  lose  in  the  system,  and  we  cannot  get  it  back.  To  bypass 
this  issue,  we  now  realize  that  we  are  not  exploiting  everything  we  know  about  our 
circumstances:  we  want  to  reconstruct  images  of  natural  scenes,  and  such  images 
typically  are  sparse  or  have  sparse  gradients.  This  motivates  us  to  investigate  the 
most  recent  research  on  how  to  solve  linear  systems  with  large  null  spaces  subject  to 
the  constraint  that  the  solution  is  sparse  or  has  a  sparse  gradient.  We  now  consider 
each  of  these  points  in  greater  detail. 

A  hyperspectral  image  can  be  thought  of  as  a  continuous  three-dimensional 
data  cube,  namely  as  a  function,  f(x,  y,  z )  where  x  and  y  are  the  spatial  coordinates 
of  the  image  and  z  is  the  frequency  of  the  light.  Many  current  hyperspectral  imaging 
systems  fall  into  two  types.  In  the  first  type,  different  color  filters  are  placed  in 
front  of  the  camera.  Each  picture  taken  through  one  of  these  filters  corresponds 
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to  a  single  2-cross-section  of  f(x,y,z).  That  is,  such  a  system  measures  f(x,y,z), 
one  horizontal  cross-section  at  a  time.  A  second  type  of  system  uses  the  push  broom 
sensor  method,  which  passes  the  available  light  (of  all  frequencies)  through  a  vertical 
slit  and  then  through  a  horizontally-aligned  prism.  The  light  that  passes  through 
the  slit  corresponds  to  a  single  x-cross-section  of  f(x,y,z).  When  passed  through 
a  prism,  this  two-dimensional  cross-section  (one  spatial  dimension,  one  frequency 
dimension)  is  transformed  into  an  intensity  image  with  two  spatial  dimensions  which 
is  then  captured  with  a  conventional  (grayscale)  camera.  This  enables  us  to,  once 
again,  measure  the  original  data  cube  one  slice  at  a  time;  here,  x  is  fixed  while  y  and 
2  vary. 

Unfortunately,  both  of  these  methods  have  problems  with  capturing  transient 
events,  that  is,  events  that  occur  over  a  small  interval  of  time.  With  the  color-filter 
method,  we  may  see  the  event,  but  will  only  capture  it  in  a  few  color  bands.  With 
the  push  broom  method,  we  may  not  see  the  event  at  all,  and  even  if  we  do,  we  will 
only  measure  a  small  vertical  slice  of  it. 

In  this  thesis,  we  investigate  a  modified  version  of  the  push  broom  system  in 
which  the  slit  is  removed,  that  is,  all  of  the  available  light  is  allowed  into  the  camera, 
and  is  then  dispersed  by  the  prism.  Though  this  allows  more  information  to  enter  the 
camera,  it  has  the  unfortunate  side-effect  of  producing  a  horizontal  blurring  effect. 
We  can  model  such  a  system  with  the  following  linear  operator: 

OO 

( Lf)(x,y)=  I  f(x-z,y,z)dz.  (1) 

—  OO 

To  understand  this  formula,  consider  a  very  simple  hyperspectral  image  that  only  has 
three  color  channels:  red,  green,  and  blue  with  frequencies  —1,  0,  and  1,  respectively. 
The  operator  L  acts  as  an  adding  procedure  that  sums  the  number  of  photons  that 
accumulate  at  pixel  location  (x,  y)  in  the  intensity  image  produced  by  the  camera. 
In  this  case,  ( Lf)(x,y )  represents  the  number  of  green  photons  at  the  hyperspectral 
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image’s  (x,y)  coordinate,  along  with  the  number  of  red  photons  at  (x  +  l,y)  and 
the  number  of  blue  photons  at  ( x  —  1  ,y).  This  corresponds  to  the  actual  number 
of  photons  (of  any  color)  measured  by  our  grayscale  camera  at  pixel  location  (x,y), 
since  our  ideal  prism  allows  green  photons  to  pass  straight  through,  but  shifts  red 
and  blue  photons  one  unit  to  the  left  and  right,  respectively. 

Note  that  in  (1),  we  are  assuming  that  the  prism  is  oriented  horizontally, 
namely  at  a  0°  rotation  angle.  If  we  were  to  instead  rotate  the  prism  by  90°  counter¬ 
clockwise,  we  would  suffer  a  vertical  blurring,  and  we  would  instead  be  integrating 
f(x,y  —  z,  z)  over  all  frequencies  z. 

This  blurring — a  consequence  of  our  desire  to  allow  more  light  into  the  camera — 
also  unfortunately  leads  to  a  loss  of  information.  In  an  attempt  to  recover  this  miss¬ 
ing  information,  we  use  a  system  that  spins  the  prism.  That  is,  for  any  possible 
prism  angle  9,  we  produce  an  intensity  image  corresponding  to  a  “projection”  of  the 
original  hyperspectral  image  along  that  axis.  Mathematically,  we  model  this  process 
as  a  generalization  of  (1).  In  particular,  we  integrate  over  every  possible  ^-rotation 
of  the  lines  we  integrate  over  in  (1).  Thus,  the  measured  data  we  produce  is  of  the 
form  ( Lf)(x ,  y,  9),  where  x  and  y  are  spatial  coordinates  and  9  is  the  prism’s  rotation 
angle.  To  be  precise,  we  now  define  Lf  as 

OO 

( Lf)(x,y,9 )—  I  f(x  —  zcos9,y  —  zsm9,z)dz.  (2) 

—  OO 

In  the  literature,  such  an  angle-indexed  collection  of  line  integrals  is  known  as  a 
continuous  X-ray  transform.  Such  transforms  have  long  been  studied  in  the  held  of 
tomography.  Given  a  data  cube  f(x,y,z)  we  model  our  camera  measurements  as 

g{x,  y,  0)  =  (. Lf)(x ,  y,  9)  +  e(x,  y,  9).  (3) 
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Whereas  /  is  a  real- valued  function  over  M3,  g  is  a  real- valued  function  over  M2  x  T, 
where  T  :=  M/27tZ  is  the  set  of  all  possible  prism  angles.  In  order  to  build  a  useful 
CT  camera,  we  need  a  mathematical  method  for  reconstructing  /  from  g.  That  is, 
if  possible,  we  would  like  to  invert  the  X-ray  transform.  Unfortunately,  as  we  now 
discuss,  this  transform  is  not  invertible. 

Indeed,  note  that  due  to  the  noise  term  in  (3),  the  equation  Lf  =  g  may  not 
even  have  a  solution.  The  standard  way  to  get  around  this  issue  is  to  instead  consider 
the  linear  least  squares  version  of  this  equation.  That  is,  we  find  the  /  that  makes 
Lf  as  close  to  g  as  possible  in  the  L2  sense: 


argrnin  \\Lf  —  g\\% 

/eL2(R3) 


oo  oo  oo 


arg  mm 

/eL2(R3) 


I (Lf)(x,y,6)  —  g(x,  y,  9)\2  dx dy  dz. 


—  OO  — OO  — OO 


(4) 


To  be  clear,  the  argrnin  operator  returns  the  set  of  /’ s  for  which  || Lf  —  g\\ \  attains 
its  minimum  value.  This  is  not  to  be  confused  with  the  min  operator  which,  given 
the  same  argument,  would  return  the  minimum  value  of  || Lf  —  g\\\  over  all  /.  Here, 
we  are  making  the  (reasonable)  assumption  that  /  G  L2(M3),  g  e  L2(M2  x  T)  as  well 
as  the  (unreasonable,  probably  false)  assumption  that  the  operator  L  given  in  (2) 
is  a  well-defined  bounded  linear  operator  between  these  spaces.  It  is  this  lack  of 
rigor  that  motivates  our  finite- dimensional  CT  model  given  in  Chapters  III  and  IV; 
until  then,  we  shall  wave  our  hands.  If  L  was  a  bounded  linear  operator  between 
Hilbert  spaces,  the  standard  Hilbert  space  theory  of  linear  least  squares  gives  that 
the  solutions  to  the  minimization  problem  in  (4)  are  precisely  the  solutions  to  the 
normal  equations 

ULf  =  L*g,  (5) 

where  L*  :  L2(M3)  — >  L2(M2  x  T)  is  the  adjoint  of  L ,  namely  the  bounded  linear 
operator  that  satisfies  (f,L*g)  =  ( Lf,g ).  In  the  next  chapter  we  informally  show 
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that  L*  has  the  form 


2tt 

(L*g)(x,y,  z)  —  J  g(x  +  zcos9,y  +  zsin.9,9)  d9.  (6) 

o 

In  the  tomography  literature,  L*  is  known  as  back  projection.  To  illustrate  how  L* 
behaves,  first  consider  how  the  operator  L  treats  a  single  frequency  of  light  Zq  emitted 
from  a  specific  spatial  coordinate  (xo,Z/o)-  Recall  the  intensity  of  light  at  this  point 
is  denoted  f(xo,yo,zo).  Under  the  action  of  L,  the  value  of  /  plays  a  role  in  several 
values  of  ( Lf)(x ,  y,  9).  In  particular,  if  /  were  a  Dirac-h  function  based  at  (x0,  yo,  Zq), 
the  nonzero  values  of  Lf  would  trace  out  a  helix  in  M2  x  T.  This  helix  is  “centered” 
at  ( xo ,  yo)  £  1R2  and  has  radius  zq.  In  particular,  if  we  were  to  shine  a  fixed  laser  into 
our  camera,  the  rotation  of  the  prism  would  cause  this  point  to  trace  out  a  circle 
in  the  xy- plane  over  time.  Looking  at  (6),  we  see  that  the  adjoint  of  L  integrates  g 
over  this  helix  in  order  to  form  the  value  (L*g)(xo,yo,  Zq).  In  short,  (L*g)(xo,yo,  Zq) 
sums  those  values  of  g(x,y,9)  which,  if  g  were  of  the  form  Lf,  depend  on  the  value 
f(x o,yo,Zo).  In  previous  research,  L*  has  been  described  as  shifting- and- adding  the 
camera  measurements  [17]. 

Having  the  expressions  for  L  and  L*  given  in  (2)  and  (6),  respectively,  we 
return  to  the  normal  equations  (5).  The  right-hand  side  of  (5)  is  determined  by 
back-projecting  (shifting-and-adding)  the  camera  measurements  g.  Meanwhile,  the 
left-hand  side  of  (5)  is  determined  by  the  composition  of  L  and  L* .  In  the  next 
chapter,  we  informally  show  that  the  operator  L*L  is  of  the  form 


27 r  oo 


(■ L*Lf)(x,y,z )  = 


f(x  —  t  cos  9,y  —  t  sin  9,  z  +  t)  d  t  d  9. 


(7) 


0  — oo 


From  (7),  we  can  see  that  L*L  integrates  the  values  of  /  that  lie  on  the  surface 
of  a  cone.  To  clarify,  note  that  in  (7)  we  integrate  over  all  points  of  the  form 
(. x,y,z )  —  f(cos#,sin$,  —1).  For  any  9,  (cos  9,  sin  9,  —1)  is  a  point  on  the  unit  circle 
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in  the  z  =  —  1  plane.  As  t  varies,  the  values  f(cos#,  sin#,  —  1)  trace  out  a  straight 
line  passing  through  this  point  and  the  origin.  As  9  varies,  these  lines  trace  out  a 
cone  whose  vertex  lies  at  the  origin,  and  has  “slope”  1  with  respect  to  the  xy- plane. 
Subtracting  the  values  f(cos#,  sin#,  —1)  from  (x,y,  z )  essentially  translates  this  cone 
so  that  its  vertex  lies  at  ( x,y,z ).  That  is,  L*L  performs  a  rolling  average  of  the 
values  of  /  over  this  cone’s  surface.  In  the  image  and  signal  processing  literature, 
such  “rolling  average”  operators  are  known  as  filters.  Moreover,  it  is  well-known  that 
filters  are  best  understood  in  terms  of  the  (unitary)  Fourier  transform ,  namely  the 
operator,  E*  :  L2(M3)  — >  L2(M3),  defined  as 


OO  OO  OO 


(E*f)(a,  7)  :  = 


3-27ri (ax+Py+riffa  ^  dx  dy  dz 


— 00  — 00  — OO 


In  particular,  in  the  next  chapter,  we  will  show  that  in  the  Fourier  domain,  L*L 
becomes  a  pointwise  multiplication  operator: 


(E*L*Lf)(a,(3,1)  =  (E*f)(a,(3,1)x 


a2-\-/32— 72  7 

0, 


7 2  <  a2  +  /32, 

y2  >  a2  +  fi2. 


(8) 


From  (8),  we  can  see  that  the  operator  L*L  destroys  information,  and  we  cannot 
get  it  back.  In  particular,  in  the  Fourier  domain,  L*L  multiplies  the  values  of  E*  f 
by  positive  scalars  if  y2  <  a2  +  f3 2  and  by  zero  if  y2  >  a2  +  /32.  That  is,  the  null 
space  of  L*L  corresponds  to  those  functions  /  G  L2(M3)  whose  Fourier  transforms 
are  completely  supported  on  the  cone 


{(a,/3,y):y2>«2  +  /32}. 


(9) 


Moreover,  since  N(L*L)  =  N(L),  we  can  see  that  our  camera  system  is  not  one-to- 
one:  two  hypercubes  f\  and  /2  can  yield  the  same  camera  measurements  even  if  their 
Fourier  transforms  are  different,  provided  these  differences  only  occur  at  frequencies 
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that  lie  within  the  cone  (9).  In  the  tomography  literature,  this  null  space  is  known 
as  the  cone  of  missing  information. 

Since  V s  null  space  is  so  large,  we  are  forced  to  make  additional  assumptions 
about  /  in  order  to  have  any  chance  of  reconstructing  it  from  g  =  Lf  +  e.  The 
purpose  of  our  camera  system  is  to  observe  transient  events  occurring  in  natural 
scenes,  meaning  we  may  want  to  restrict  ourselves  to  a  class  of  /’ s  that  are  either 
sparse  or  have  sparse  gradients.  To  be  precise,  we  say  that  a  vector  is  sparse  if  it  has 
very  few  non-zero  entries.  An  example  of  a  natural  scene  that  is  sparse  would  be  a 
clear  night  sky  (a  nearly  black  background  with  a  few  stars),  or,  alternatively,  the 
fireball  of  an  explosion  seen  at  night.  That  is,  a  sparse  image  is  one  that  is  mostly 
black  with  only  a  few  spots  of  bright  color.  On  the  other  hand,  if  the  gradient  of  / 
is  sparse  this  means  the  image  can  be  broken  up  into  large  regions  where  V/  =  0, 
that  is,  regions  in  which  the  color  is  constant.  Some  might  describe  these  scenes  as 
“cartoon-like”  because  their  appearance  is  similar  to  paint  by  number. 

Mathematically,  sparsity  relates  to  a  zero  “norm,”  which  counts  the  number 
of  non-zero  entries  in  a  vector.  That  is,  in  order  to  find  the  sparsest  solution  to  the 
normal  equations,  we  would  like  to  solve 

argrnin  ||/||0  subject  to  L* Lf  =  L*g.  (10) 

/ 

Unfortunately  the  zero  “norm”  is  not  a  norm:  it  does  not  distribute  over  scalar 
multiplication.  More  importantly,  optimization  is  mostly  calculus-based,  meaning 
that  in  order  to  solve  an  optimization  problem  such  as  (10),  we  would  like  to  use  a 
norm  that  is  differentiable.  The  zero  “norm”  is  not  differentiable,  suggesting  that  we 
should  use  a  different  norm  to  solve  our  problem.  Recently,  researchers  have  begun 
to  use  the  1-norm  as  a  proxy  for  the  zero  “norm;”  the  1-norm  of  /  is  defined  as 


OO  OO  OO 


—  OO  — OO  — OO 


\f(x,y,z)\  dxdydz. 


Though  not  differentiable  everywhere,  this  norm  is  nevertheless  a  convex  function  of 
/  meaning  it  can  be  analyzed  with  the  classical  convexity-based  generalizations  of 
calculus-based  optimization.  Therefore,  instead  of  solving  (10)  we  try  to  solve 

argmin  \\f\\i  subject  to  L*Lf  =  L*g.  (11) 

/ 

It  turns  out  that,  even  with  this  relaxation,  finding  a  solution  to  (11)  is  too  difficult 
mathematically  because  of  the  size  of  L’s  null  space.  To  account  for  this  problem,  we 
apply  a  standard  trick,  namely  we  regularize  (11)  by  adding  a  term  that  penalizes 
large  values  of  || Lf  —  g |||.  That  is,  to  find  a  sparse  /  for  which  Lf  is  close  to  g,  we 
will  solve 

argmin  H/lli  +  ^\\Lf  -  g\\22,  (12) 

where  A  is  some  experimentally  chosen  weight.  On  the  other  hand,  to  find  an  /  with 
a  sparse  gradient  we  solve 

argmin  llV/Hi  +  ^\\Lf  -  g\\\.  (13) 

In  the  literature  ||V/||i  is  sometimes  known  as  the  total  variation  norm  of  /. 

In  practice,  we  would  attempt  to  solve  (12)  or  (13)  for  larger  and  larger  values 
of  A  and  use  the,  for  lack  of  a  better  term,  “eye  test”  to  find  the  best  solution.  Notice 
though,  that  as  A  — >  oo  in  (12)  and  (13),  the  1-norm  term  becomes  insignificant  and 
we  are  essentially  again  solving  the  ill-conditioned  problem  of  minimizing  \\Lf  —  g\\\. 
The  Bregman  method  is  a  recently  introduced,  yet  already  very  popular,  method  for 
bypassing  this  issue.  That  is,  it  provides  an  alternative,  more  numerically  stable 
approach  for  minimizing  ||/||i  or  ||V/||i  subject  to  the  constraint  that  L*Lf  =  L*g. 
We  in  particular  apply  the  most  recent  variant  of  this  algorithm,  known  as  the  Split 
Bregman  method.  This  is  an  iterative  process  in  which  we  “split”  our  optimization 
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problem  into  simpler  ones  that  are  easier  to  solve,  making  use  of  a  technique  known 
as  shrinkage  (soft-thresholding) . 

1 . 2  Outline 

This  introduction  has  served  to  build  up  our  intuition  about  the  CT  recon¬ 
struction  problem,  as  well  as  highlight  some  of  the  key  quantities  relating  to  it.  In 
Chapter  II,  we  delve  deeper  into  the  continuous  X-ray  transform  (Lf)(x,y,9),  and 
informally  derive  the  important  equations  given  above.  Then,  in  Chapter  III,  we  will 
reinvent  the  X-ray  transform  in  the  discrete  setting,  which  allows  us  to  replace  the 
hand  waving  of  Chapter  II  with  a  rigorous  analysis.  Whereas  the  continuous  model 
we  present  in  Chapter  II  makes  intuitive  sense  but  is  not  rigorous,  the  discrete  model 
in  Chapter  III  is  completely  rigorous,  but  is  less  faithful  to  the  real-world  physics 
of  our  camera  system.  In  Chapter  IV,  we  refine  this  discrete  model  so  as  to  make 
it  more  physically  realistic.  In  Chapter  V,  we  use  the  Split  Bregman  method  along 
with  the  ideas  of  Chapters  III  and  IV  to  perform  CT  reconstruction  experiments. 

1 . 3  Major  Contributions 

For  the  past  twenty  years,  several  teams  of  researchers  have  applied  principles 
from  tomography  in  order  to  attempt  to  build  better  hyperspectral  imagers  [1,3, 
6-8, 10, 13, 18].  Some  of  this  previous  work  inspired  our  derivations  involving  the 
operator  L  given  in  Chapter  II.  Previous  research  has  also  made  some  of  the  same 
connections  that  we  make:  recognizing  the  problem  is  tomographic  in  nature,  and 
using  Fourier  analysis  to  study  it  [1,8]. 

However,  much  of  the  mathematics  that  has  been  previously  applied  to  the 
CT  reconstruction  problem  is  not  rigorous  because  it  is  difficult  to  prove  results  in 
infinite- dimensional  spaces.  In  short,  the  integral  operators  that  have  been  previ¬ 
ously  applied  to  this  problem  are  not  well-defined,  bounded  linear  operators  between 
prescribed  vector  spaces.  We  take  the  novel  approach  of  attacking  this  problem  from 
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a  purely  discrete  perspective,  making  all  of  the  mathematical  derivations  rigorous 
and  correct.  That  is,  we  are  the  first  to  apply  recently  introduced  principles  of 
discrete  tomography  [2]  to  the  CT  reconstruction  problem.  This  is  the  first  major 
contribution  of  this  work. 

The  second  major  contribution  builds  on  the  results  of  our  first  contribution. 
Making  only  a  few  simplifying  assumptions — namely  that  our  continuous  function 
/  is  periodic  in  the  spatial  and  frequency  domains  and  that  it  has  been  sampled  so 
finely  that  it  is  effectively  piecewise  constant — we  show  that  the  continuous  X-ray 
transform  boils  down  to  a  slight  generalization  of  our  aforementioned  discrete  model. 
Realizing  this,  we  then  use  the  theory  stemming  from  our  first  contribution  to  gain 
a  more  realistic  understanding  of  a  CT  camera  system. 

Our  third  major  contribution  is  that  we  are  the  first  to  apply  a  recently 
introduced,  state-of-the-art  numerical  optimization  technique — the  Split  Bregman 
method — to  the  CT  reconstruction  problem.  Over  the  past  few  years,  this  method 
has  become  very  popular  in  the  sparse  image  processing  community.  However,  due 
to  its  newness  and  mathematical  sophistication,  there  are  still  many  reconstruction 
problems  to  which  it  has  not  yet  been  applied.  New  fast  reconstruction  algorithms 
such  as  the  Split  Bregman  method  are  needed  in  order  for  CT  research  to  move  for¬ 
ward,  since  only  then  will  researchers  be  able  to  perform  the  numerical  experimenta¬ 
tion  and  simulation  needed  to  address  important,  basic  questions  about  their  camera 
systems.  That  is,  reconstruction  algorithms  such  as  the  Split  Bregman  method  allow 
us  to  focus  on  the  real  issues  with  our  CT  system,  and  what  kinds  of  improvements 
are  necessary  in  order  to  make  the  system  function  better. 
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II.  A  continuous  mathematical  model  of  chromotomography 

In  this  chapter,  we  continue  to  discuss  the  continuous  X-ray  transform,  namely  our 
operator  L,  as  defined  in  (2).  In  particular,  in  this  chapter,  we  provide  the  important 
derivations  concerning  L  that  we  glossed  over  in  the  introduction.  As  we  shall  point 
out,  these  derivations  are  not  completely  rigorous  clue  to  the  mathematical  subtleties 
of  infinite-dimensional  spaces.  Nevertheless,  these  derivations  are  on  the  right  track. 
In  the  coming  chapters,  we  will  generalize  the  techniques  introduced  here  to  the 
discrete,  finite-dimensional  setting.  There,  they  become  completely  rigorous.  In 
order  to  use  the  Hilbert  space-based  theory  of  linear  least  squares,  we  assume  our 
continuous  hyperspectral  image  /  lies  in  the  Hilbert  space 


OO  OO  OO 


=  /: 


->• 


\f(x,  y,  z)\2  dxdydz  <  oo 


—  OO  — OO  — OO 


The  inner  product  on  this  space  is  defined  as 


OO  OO  OO 


(/i,/2>=  /  /  /  [fi(x,y,z)]*  f2(x,y,z)dxdydz, 


—  OO  — OO  — OO 


where  a*  denotes  the  complex  conjugate  of  any  real  or  complex  scalar  a.  As  with 
any  Hilbert  space,  this  inner  product  defines  a  norm:  ||/||  :=  \J (/,  /).  Now  recall 
the  operator  L  defined  in  (2),  which  integrates  /  over  a  set  of  lines  in  M3: 


(■ Lf)(x,y,0 ) 


OO 

j  f(x  —  zcosO,  y  —  z  sin  9,  z)  dz. 


—  OO 


Since  /  lies  in  L2(R3)  instead  of  L1(M3),  we  have  no  guarantees  that  such  line  integrals 
even  exist  (converge).  Putting  this  issue  aside,  note  that  in  order  to  use  the  standard 
theory  of  linear  least  squares,  we  will  also  want  the  output  of  the  operator  L  to  lie 
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in  a  Hilbert  space.  Since  Lf  is  a  function  over  M2  x  T,  the  natural  space  is 


2n  oo  oo 


L2(W  x  T)  =  \  g  :  M2  x  T  -G 


\g(x,y,9)\ 2  dxdydO  <  oo  L 


0  —  oo  — oo 


where  the  inner  product  is  defined  as  usual.  Here,  in  order  to  use  least  squares, 
we  need  to  go  beyond  our  original  assumption  that  Lf  is  a  well-defined  function  of 
i2xT  and  make  the  further  assumptions  that  Lf  G  L2(M2  x  T)  and  moreover,  that 
the  operator  L  is  a  bounded  linear  operator,  that  is,  L  is  a  linear  operator  for  which 
there  exists  some  constant  B  such  that  ||L/||  <  5||/||  for  all  /  G  L2(M3). 

Making  this  assumption,  we  would  like  to  find  the  /  that  solves  the  equation 
Lf  =  g  for  a  given  g.  However,  the  presence  of  noise  in  g  makes  it  likely  that  this 
equation  will  not  have  a  solution.  As  such,  we  instead  attempt  to  solve 


arg min  \\Lf  —  g\\\.  (14) 

f£L2(R3) 

That  is,  we  want  to  find  the  /  that  minimizes  the  integral  of  the  squares  of  the  point- 
wise  errors  ( Lf)(x ,  y,  9)—g{x ,  y,  6).  If  we  make  another  oversimplifying  assumption — 
that  the  range  of  L  is  a  closed  subspace  of  L2(M2  x  T) — classical  Hilbert  space  theory 
implies  that  the  solutions  to  (14)  are  equivalent  to  the  solutions  to  the  normal  equa¬ 
tions  L*Lf  =  L*g. 

The  normal  equations  make  use  of  the  operator  L*  :  L2(M2  x  T)  — »  L2(M3), 
that  is  the  adjoint  of  L,  which  is  characterized  by  the  fact  that  (/,  L*g )  =  ( Lf ,  g) 
for  all  /  G  L2(M3)  and  g  G  L2(M2  x  T).  To  see  how  to  informally  derive  the  formula 
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for  L*  given  in  (6),  consider 


(/>  L*g)  —  (Lf,g) 

27T  OO  OO 

=  //  /  [Lf{x,y,0)}*  g(x,y,0)dxdyd0 

0  — OO  — OO 

OO 

j  f(x  —  zcos9,y  —  zsind,  z)  d z 


g(x,  y ,  9)  dx  d y  d 9. 


To  proceed,  we  now  distribute  the  complex  conjugate  and  then  interchange  integrals 
(without  justification)  yielding 


2tt  oo  oo  oo 


(/,  L'g) 


0  — oo  — oo  — oo 


—  z  cos  9,y  —  z  sin  9,  z)]*g(x,  y,  9)  d z  dx  dy  d 9. 


For  any  fixed  9,  making  the  substitution  (u,  v,  w)  =  (x  —  z  cos  9,y  —  z  sin  9,  z)  trans¬ 
forms  this  expression  for  (/,  L*g )  into 

2lT  OO  OO  OO 

[f(u,  v,  w)]*  g(u  +  w  cos  9,  u  +  rasing,  9)  J(w,  v,  w)  dtc  du  du  d 9. 

0  — oo  — oo  — oo 


Here,  J (u,v,w)  is  the  Jacobian  of  this  change  of  variables,  namely  the  determinant 
of  a  3  x  3  matrix,  which  happens  to  be  upper  triangular  in  this  case: 


dx 

dx 

dx 

0 

—  cos  9 

du 

dv 

dw 

1 

J(u,  V,  w)  = 

dy 

du 

dy 

dv 

dy 

dw 

= 

0 

1 

—  sin  9 

=  (1)(1)(1)  =  1 

dz 

dz 

9z_ 

0 

0 

1 

du 

dv 

dw 

Therefore,  our  expression  for  (/,  L*g)  becomes 


(/,  L'g) 


2n  oo  oo  oo 

[f(u,  v,  w)]*g(u  +  w  cos  9,  u  +  w  sin  9,  9)  die  du  du  c 19. 

0  — oo  — oo  — oo 
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Once  again,  we  interchange  integrals  (without  justification)  and  find 


OO  OO  OO 


(f,L*g)  =  [f(u,v,w)]* 


—  OO  — OO  — OO 


2tt 


g(u  +  w  cos  9,  u  +  w  sin  9 ,  6)  d 9 


die  d u  du. 


That  is,  we  necessarily  have  that,  as  given  in  (6),  L*  has  the  form 

2t r 

( L*g )  ( x ,  y,  z)  =  J  g(x  +  z  cos  9,y  +  z  sin  9,  9)  d 9. 
o 

In  words,  for  each  (x,  y,  z )  triple,  L*  integrates  g  over  a  helix.  As  noted  above,  L*  is 
commonly  referred  to  as  a  back  projection  or  as  the  shift-and-add  method. 

Returning  to  our  goal  of  solving  the  normal  equations  L*Lf  =  L*g ,  we  see  that 
in  order  to  reconstruct  /,  we  first  shift  and  add  g  and  then  attempt  to  invert  the 
operator  L*L  :  L2(M3)  — y  L2(M3).  To  do  this,  we  now  derive  L*L  from  (2)  and  (6): 


2n 


(. L*Lf )  (x,  y,  z)  =  /  ( Lf )  (x  +  z  cos  9,y  +  z  sin  9,  9)  d 9 


o 

2n  oo 


f(x  +  z  cos  9  —  w  cos  9,y  +  z  sin  9  —  w  sin  9 ,  w)  dw  d9 


0  — oo 
2n  oo 


f(x  +  (z  —  w)  cos  9,  y  +  (z  —  w)  sin  9 ,  w)  dtc  d 9. 


0  — oo 


For  any  fixed  9 ,  making  the  substitution  t  =  w  —  z  gives 


2n  oo 

(L*Lf)(x,y,  z)  —  j  j  f(x  —  tcos9,y  —  tsm9,z  +  t)dtd9. 

0  — oo 


We  now  notice  that  L*L  is  an  example  of  a  well-studied  class  of  linear  operators, 
namely  filters.  Filters  arise  from  translations,  which  are  operators  that  shift  a  func¬ 
tion’s  input  by  a  certain  amount.  In  L*L,  the  x,  y,  and  z  inputs  are  shifted  by  t  cos  9 , 
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t  sin  0,  and  —t,  respectively.  It  is  well-known  that  such  hlters  are  diagonalized  by 
the  (continuous)  Fourier  transform,  E*  :  L2(M3)  — >  L2(M3),  defined  as 


OO  OO  OO 


(£*/)(«,  ft  7)  :  = 


r2m(ax+flv+yz)f(x,  ^  ^  fa  dy 


— oo  — OO  — OO 


The  Fourier  transform  is  a  unitary  operator,  meaning  that  E*  =  E  1 ,  where  E  : 
L2(M3)  — y  L2(M3)  is  the  Inverse  Fourier  Transform,  defined  as 


OO  OO  OO 


( Ef){x,y,z )  := 


^ax+gy+jz)f^  ^  ^  da  dp  d7_ 


— oo  — oo  — oo 


To  see  how  the  Fourier  transform  plays  a  role  here,  we  now  apply  it  to  L*Lf : 


(E*L*Lf)(a,  f3, 7) 

e-2^(ax+gy+1z)  (L*Lf^  (Xj  yt  2)  dx  dy  d^ 


00  00  00 


—00  —00  —00 
00  00  00 


—00  —00  —00 


2-7T  OO 

e-2m(ax+Py+'yz)  J  J  f  (x  —  f  cos  Q  ^  y  —  f  s[nQ  ^  z  - f  £)  dtdOdxdydz. 
0  —00 


Again  interchanging  integrals  (without  justification)  this  becomes 


2n  00  00  00  00 

e-2m(ax+py+r/z)  j ^  cos  q  y  _ -f-  sjn  ^  z  +  f)  dx  dy  dz  d t  d 6. 

0  —00  —00  —00  —00 

At  this  point,  for  any  fixed  t  and  9 ,  we  make  the  change  of  variables  ( u,v,w )  = 
(x  —  tcos9,y  —  t  sin  9,z  +  t).  It  turns  out  that  the  J (u,v,w)  for  this  change  of 
variables  is  1,  and  so  our  expression  for  (E*L*Lf)(a,  {3, 7)  becomes 


2n  00  00  00  00 

e-2ni(a(u+tcose)+P(v+tsind)+'y(w-t))j^  ^  ^  du  d^  dw  df  dQ 

0  —00  —00  —00  —00 
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Once  again,  we  switch  the  order  of  integration  and  our  expression  becomes 


2n  oo 


oo  oo  oo 


2^atcosd+ptSin0-1t)dtde  /  /  /  f{u,v,w)dudvdw 


0  — oo 


— oo  — oo  — oo 


27T  OO 


-2ni(atcoS0+ptsin0+-y(-t))dt  dQ  ( E*  f)(a ,  /?,  y) 


,  0  — oo 


=  A(a,/3,7)(£*/)(a,A7), 


where  for  any  (a, /3, 7)  G  M3,  the  scalar  A(a, /?,  7)  is  defined  as 


A(a,/3,7) 


2tt  00 


-27ri(at  cos  Q+(3t  smd-\-^(—t))  ^ 


0  —00 


That  is,  we  have  (E*L*Lf)(a,  /3, 7)  =  A(a,  /?,  ^(E*  f)(a,  (3, 7),  meaning  that  in  the 
Fourier  domain,  acts  as  a  pointwise  multiplication  operator.  In  essence,  for 
any  (a,  (3, 7)  G  M3,  we  have  that  the  function  ea^^(x,y,  z)  :=  e27n(ax+Py+'yz)  js  an 
eigenfunction  for  L*L  with  corresponding  eigenvalue  A(a,/3,7).  To  understand  this 
better,  we  now  further  investigate  A(a,/3,7).  In  particular,  we  write  any  (a,  (3)  G  M2 
in  polar  form  as 

(ot,P)  =  V a2  +  f32  (  a  ,  f  )  =  \/ a2  +  /32(cos  0a>/3,  sin  0a>/3). 

Ya/o;2  +  P2  A/a2  +  p2y 
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This  allows  us  to  make  the  following  observation: 


2tt  oo 


A  (a,  0,7)=  /  / 


0  —00 
OO  2-7T 


W  f 


.  cos  0H — ,  P  sin  0  1 
+^2  7^5  ;  cW  dt 


=  fe  J 

—00  0 

OO  2-7T 


=  Je  J 

—00  0 


2-7ri7t  /  ^-2^7/ a2+(32t (cos  <£a?/3  cos  0+sin  0a>/3  sin  ^  ^ 


Using  the  angle  sum  formula  for  cosine,  we  find 


00  27 r 

^ni'yt  f  -2Triy/a2+l32tcos^e-(patp'j 


A(a,/3,7)  =  J  e^-'  j  e 

— 00  0 

OO  7T 

_  f  e2m-yt  j  e-27ri 7 a2+/32t  cos  0  d(£)  (p 


dd  dt 


—  OO  —  7T 

OO 


=  2./e' . ,/e 

— 00  0 


27ri7t  /  —  ‘Z'KiyJ  a2+/32t  cos  0 


dd  dt. 


Making  the  substitution  <f>  =  d  —  f  transforms  this  expression  into 


A(a,  /3, 7)  =  2  [  e2^  [  e-27riV^+^c°s(<Hf )  d0  dt 


2  /"  e27riT*  f  e-27ri7a2+/32hcoS(/>cos  f-sin</>sin  f  )  ^0  ^ 


=  2  /  e 


27ri7t  r  27ri  yfoP+ffit  sin  0 


d(/>  dt. 
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At  this  point,  making  the  substitution  u  =  sin  (f>,  our  expression  becomes 


\(a,  p,  7)  =  2  f  e2nht  [  e™Va2+P2t*—l 


'  —  OO 
OO 


=  2  /  e 


/_  1  a/1  —  u2 

2n\~ft  f  27ri^/ a2+/32tu  (  1^1)  wO 


dw  dt 


du  dt. 


■u- 


—00  —  OO 


Letting  v  =  —U\J a2  +  f32  transforms  this  expression  into 


A(a,/3,7) 


du  dt 


—  OO  — OO 


Recognizing  this  expression  to  be  a  composition  of  a  one- dimensional  Fourier  trans¬ 
form  with  its  inverse,  our  expression  becomes 


A(a,/3,7) 


21(  1,1}  (yfikp) 

\J  a1  +  f32  —  72 


\/ a2+/32— 72  ’ 

0, 


72  <  a2  +  /32, 
72  >  a2  +  /32. 


Combining  this  fact  with  our  earlier  observation  gives 


\J o2+/32— 72  ’ 

0, 


72  <  a2  +  f3 2, 
72  >  a2  +  /32. 


(15) 


Note  that  A(a,  /?,  7)  =  0  whenever  72  >  a2  +  f32.  This  means  annihilates  all  the 
frequency  information  of  /  that  lives  in  the  cone  {(a, /?, 7)  :  72  >  a2  +  (32}.  This 
is  why  this  cone  is  called  the  “cone  of  missing  information.”  This  means  that  even 
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if  all  the  assumptions  we  made  in  this  chapter  were  true,  in  order  to  reconstruct  / 
from  g  we  would  still  be  faced  with  solving  a  linear  system  L*Lf  =  L*g  where  the 
null  space  of  L*L  is  gigantic.  This  means  that  in  order  to  uniquely  solve  our  system, 
we  must  make  additional  assumptions  about  /  (for  example,  that  /  is  sparse  or  its 
gradient  is  sparse),  and  then  reconstruct  according  to  these  assumptions.  In  practice, 
such  restrictions  on  /  are  usually  enforced  by  regularizing  our  least  squares  problem 
and  then  performing  the  reconstruction  according  to  some  numerical  optimization 
scheme.  Since,  at  this  point,  all  of  our  computations  will  necessarily  become  finite¬ 
dimensional,  we  are  led  to  first  investigate  how  the  above  derivations  generalize  to 
that  setting.  As  we  shall  see,  in  our  finite-dimensional  CT  model,  we  can  rigorously 
address  all  of  the  technical  issues  that  we  oversimplified  in  this  chapter.  We  now 
turn  to  a  discrete,  finite-dimensional  model  that  will  enable  us  to  rigorously  address 
all  of  the  issues  that  were  glossed  over  here. 
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III.  A  discrete  mathematical  model  of  chromotomography 

Thus  far,  we  have  modeled  our  CT-based  hyperspectral  imaging  system  in  terms 
of  line  integrals  of  functions  over  the  continuum.  However,  as  seen  in  the  previous 
chapter,  most  of  our  derivations  in  these  continuous  models  are  not  mathematically 
rigorous.  In  order  to  increase  our  understanding  of  our  camera  system  and  to  make 
our  arguments  completely  legitimate,  we  now  discretize  our  hyperspectral  image  /, 
our  operator  L ,  and  the  surrounding  theory. 

Note  that  in  the  continuous  model,  in  order  to  understand  the  null  space  of 
L*L  (and  thus  its  noninvertibility),  we  eventually  made  use  of  Fourier  transforms. 
Therefore,  we  want  a  function  space  over  which  a  Fourier  transform  can  be  defined. 
Though  many  such  spaces  exist,  only  a  select  few  of  these  are  finite-dimensional, 
as  needed  in  order  to  rigorously  derive  our  CT  theory  using  as  little  functional  and 
harmonic  analysis  as  possible.  In  particular,  since  our  hyperspectral  image  f  will,  in 
this  finite- dimensional  model,  become  a  discrete  three-dimensional  data  cube,  we  use 
the  three-dimensional  Discrete  Fourier  Transform  (DFT).  This  transform  is  defined 
over  spaces  of  scalar- valued  functions  of  Z3  :=  Z  x  Z  x  Z  that  are  periodic  in  each 
direction.  To  be  precise,  for  any  positive  integer  P,  we  consider  the  space 


£(Z  3P) 


( 


{  f  :  Z3  C 


f[m,  n,p]  = 


f  [m  +  P,  n,  p] 

f  [m,  n  +  P,p\  ,  Vm,  n,pe  Z  >  . 
f  [m,  n,p  +  P] 


It  is  well-known  that  f'(Zp)  space  is  a  Hilbert  space  under  the  inner  product 


(fi,f2>:=  5Z(fiKn,p])*f2h,n,p]. 

mGZp  nEZp  pEZp 

Here  and  throughout,  summing  over  all  “m  G  Zp”  means  to  sum  over  a  set  of  coset 
representatives  of  the  subgroup  PZ  of  the  integers  Z.  For  example,  one  choice  of 
coset  representatives  is  to  take  m  =  0, ...,  P  —  1.  Note  this  space  has  dimension  P3, 
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with  the  only  independent  quantities  of  f  being  the  values  f  [m,n,p\  for  m,n,p  = 
0, 1, . . . ,  P  —  1.  Recall  that  in  our  continuous  model,  light  at  frequency  z  —  0  passes 
directly  through  the  prism.  Also,  when  the  prism  orientation  angle  9  is  0,  light  at 
frequencies  z  —  1  and  z  =  —  1  is  shifted  one  unit  to  the  right  and  left,  respectively. 
In  order  to  keep  this  same  intuition  in  the  discrete  setting,  it  is  helpful  to  regard  0 
as  the  center  of  Zp.  That  is,  when  P  is  even,  we  take  the  coordinates  of  f  to  range 
from  ^  2  t°  2  —  1 '  Meanwhile  when  P  is  odd,  we  take  them  from  —  t°  ^vr1- 

As  seen  in  (2),  the  continuous  X-ray  transform  of  /  G  L2(M3)  at  a  given  triple 
(x,y,9)  6  l2  x  T  is  obtained  by  integrating  f(x  —  zcos9,y  —  zsm9,z)  over  every 
frequency  In  the  discrete  setting,  the  input  for  our  operator  L  will  be  a  triple 

[m,  n,  q]  where  m  and  n  are  the  horizontal  and  vertical  pixel  indices,  respectively, 
and  q  represents  a  discrete  angle  parameter.  Specifically,  given  some  positive  integer 
Q ,  let  ^1,^2  ^  ^(Zq,  Z)  be  two  Q-periodic  integer-valued  functions  defined  over  Z. 
These  functions  will  play  the  role  of  discrete  cosines  and  sines:  below,  when  we 
discretize  our  continuous  X-ray  transform  (2),  we  will  replace  the  terms  z  cos  6  and 
z sin 9  with  pipi[q\  and  pip2 [q] ,  respectively. 

For  example,  if  our  camera  system  used  Q  —  4  angles,  we  may  use  the  four 
evenly-spaced  pairs  of  integers,  namely 

(^[0],^2[0])  =  (1,0), 

(^i[l],^2[l])  =  (0,1), 

(16) 

(^[2],^2[2])  =  (-l,0), 

("01- [3],  ^2  [3])  =  (0,-1). 

If  we  wanted  to  use  Q  —  8  angles,  we  could  use  the  “knight’s  moves”  set: 


{(2, 1),  (1, 2),  (-1, 2).  (-2, 1),  (-2,  -1),  (-1,  -2).  (1,  -2),  (2,  -1)}. 


(V) 


These  8  points  are  depicted  in  Figure  1.  We  can  now  form  a  discrete  operator  L 
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6 


Figure  1:  Here  we  portray  the  8  symmetric  discrete  “angles”  known  as  the  “knight’s  moves”  set,  as  given  in  (17). 
For  this  figure,  we  have  chosen  P  =  13. 


which  generalizes  (2): 

(Lf )[m,  n,q\  =  ^  f  [m  -  pipijq],  n  -  p02[g],p]-  (18) 

That  is,  instead  of  integrating  over  all  frequencies  z,  we  now  sum  over  a  finite  number 
of  frequencies  p  G  7Lp.  Note  that  since  f  e  £(Zp)  and  ipi,ip2  e  0Zq,Z),  then  Lf  is 
(P,  P,  Q)-periodic.  Indeed,  the  P-periodicity  of  f  gives 

(Lf )  [m  +  P,  n,  q]  =  ^  f  [m  +  P  —  pipi  [q] ,  n  -  p^2 [9] ,  p] 

p£Zp 

=  ^  i  [m-  pip  1  [g] ,  n  -  pip2  [q] ,  p] 

p&jp 

=  (Lf)[m,n,g]. 

The  fact  that  (Lf)[m,  n  +  P,  g]  =  (Lf)[m,  n,  q]  follows  similarly.  Finally,  the  Q- 
periodicity  of  Pi  and  ip2  gives 

(Lf)[m,rc,g  +  Q]  =  ^  f[m  -  p-0i[g  +  <2],  n  -  p02[g  +  Q\,p] 

p&Lp 

=  X]  f  ['m  -  P'01  [?] ,  n  -  pip 2  [q] ,  p] 

p&Lp 

=  (Lf)[m,  n,  g]. 
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That  is.  Lf  lies  in  the  Hilbert  space 


x  ^q)  —  \  g  :  Z>3  — >■ 


g  [m,n,q]  = 


g  [m  +  P,n,q ] 

g [m,  n  +  P,  q]  ,  Vm,  n,q  G  Z  >  . 
g[m,n,g  +  Q] 


One  can  immediately  see  that  L  :  t(Zp)  — >  ^(Zp  x  Zq)  is  linear.  Also,  since  t'(Zp)  is 
finite-dimensional,  this  linear  operator  is  automatically  bounded.  Moreover,  since  L 
is  bounded  its  adjoint  exists.  As  such,  the  technicalities  we  encountered  in  Chapter  II 
concerning  whether  or  not  L  and  L*  are  well-defined  do  not  arise  here. 

Having  that  L*  exists,  we  now  parallel  the  derivation  in  the  continuous  setting 
in  order  to  find  an  explicit  formula  for  it.  Throughout  the  following  calculations, 
summations  over  the  variables  m,  n,  p  range  over  Zp,  while  those  over  the  variable  q 
range  over  Z q.  For  any  f  G  £(Zf>)  and  any  g  G  £(Zp  x  Zq): 


(f,  L*g)  =  (Lf,  g) 

m  n  q 

=  -  fei  ’ n  -  p^2  m  ’  pi y * §  k 

m  n  q  p 

At  this  point  in  Chapter  II,  we  interchanged  integrals  without  justification.  Here 
however,  our  sums  are  finite,  and  so  we  can  interchange  them  freely: 

(f.  L*g)  =  -  pyJi[q],n  -  pip2[q],p\)*g[m,  n,  q}. 

q  p  m  n 
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For  any  p  and  g,  we  now  substitute  ( m',n ')  for  (m  —p/ipi[q\,n  —  p0i[g]): 

n',p\y  g  [m'  +  P'01  [9] ,  n'  +  P'02  [g] ,  g] 

<7  p  m'  n’ 

g  [m'  +  p0i  [g] ,  n'  +  p02  [g] ,  g]  • 

m'  n'  p  q 

That  is,  the  adjoint  of  L  is 

(L*g)  [m,n,p]  =  E  g[m  +  p0i  [g],  n+ p-02  [g],g].  (19) 

Note  the  similarities  between  (6)  and  (19);  the  L*  given  here  adds  up  the  values  of 
g  over  a  discrete  “helix”  centered  at  [m,  n]  with  radius  p. 

At  this  point,  recall  that  we  want  L*  in  order  to  help  us  solve  the  least  squares 
problem  argminf  ||Lf  —  g|||.  Since  L  is  a  bounded  linear  operator  between  Hilbert 
spaces,  this  problem  is  equivalent  to  solving  the  normal  equations  L*Lf  =  L*g. 
Moreover,  these  equations  will  have  a  solution  provided  the  range  of  L  is  closed.  In 
the  continuous  setting,  this  was  yet  another  assumption  we  made  in  order  to  proceed. 
Here  however,  we  need  not  make  such  an  assumption  since  it  is  automatically  true: 
the  range  of  L  is  a  subspace  of  the  finite- dimensional  space  £(Z2P  x  Zq)  and  so  is 
necessarily  closed.  Thus,  we  can  indeed  minimize  || Lf  —  g|| |  by  solving  L*Lf  =  L*g. 

That  is,  given  camera  data  g  G  £( Zp  x  Zq),  our  first  step  is  to  compute  L*g 
according  to  (19).  We  then  attempt  to  solve  the  normal  equations.  To  facilitate  this, 
we  now  derive  an  explicit  formula  for  L*L  from  (18)  and  (19): 

(L*Lf )[m,n,p\  =  ^ (Lf ) [m  +  p-0i [g] ,  n  +  p02 [g] ,  g] 

<? 

=  EE  f  [m  +  p0i  [g]  -  p'0 1  [g] ,  n  +  p02  [g]  -  pV 2  [g] ,  p']  ■ 

q  p1 
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For  any  fixed  p,  letting  r  =  p'  —  p  gives 


(L*Lf)[m,  n,p]  =  EE  f  [m  —  ripi  [9] ,  n  —  2  [q],p  +  r] .  (20) 

ijGZq  rGZp 

In  Chapter  II,  we  informally  discussed  how  our  continuous  operator  L*L  was 
a  filter.  Now  that  we  are  in  the  discrete  setting,  we  can  make  this  notion  rigorous 
by  using  the  theory  of  convolutions.  Convolution  is  a  type  of  vector-vector  prod¬ 
uct:  it  takes  any  two  functions  and  produces  a  third.  To  be  precise,  we  define  the 
convolution  of  fi,f2  G  £(h 3P)  as 

(fi  *  f2)[m,  n,p]  —  fi[m',  n/,p/]f2[m  —  m',  n  —  n',p  —  p'].  (21) 

m'GZp  n'GZp  p'GZp 

This  convolutional  product  is  well-known,  having  many  nice  properties.  In  partic¬ 
ular,  under  standard  addition  and  this  notion  of  multiplication,  it  turns  out  that 
£(Z p)  is  a  commutative  ring  with  identity  S (0,0,0)  ■  Here,  for  any  mo,no,Po  G  Z,  the 
Dirac-6  function  at  (mo,no,po)  is  6(m0tTl0tP 0)  G  £(Zp), 

I  1,  m  =  mo,  n  =  n0,  p  =  p0,  mod  P, 

O(m0,n0,p0)  I'M',  n,  p\  .=  < 

I  0,  else. 

One  way  to  think  about  the  convolution  of  fi  and  f2  is  that  it  is  a  “rolling 
average”  of  the  values  of  fi  with  coefficients  from  f2.  As  we  now  show,  this  is 
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precisely  what  our  operator  L*L  does;  we  rewrite  (20)  as 


(L*Lf)[m,  n,p]  =  f  [m  —  rrij>i[q],n  —  rip 2 [<?] , P  —  (— r)] 

q  r 

—  'y  ^  "y  ^{^(rtl!i\q],rip2[q],-r)  *  f )[m,Tl,p\ 

q  r 

EE  &r(il>i [g] ,^2 [<?],- 1)  j  *  f  \jTl^n^p\ 

\  q  r  / 

5^0  *f [m,n,p\ 


where,  for  any  q  G  Z,  hg  G  £(Zp)  is  a  “characteristic  function”  of  the  discrete  “line” 
that  consists  of  all  integer  multiples  of  ('^i[q\,'^2[o\,  —  1): 


hq.  Sr^ip!  [q],y2 [<?],—  !)• 

r£Zp 

We  can  now  express  L*L  as 

L*Lf  =  h*f, 

where  h  G  f'(Zp)  represents  the  “surface”  of  a  discrete  “cone”: 


(22) 


(23) 


h [m,n,p]  :=  h,[m,n,p]  =  E  (24) 

qeZq  geZQ  r£ZP 

That  is,  at  any  (■ m,n,p )  G  Z3,  L*Lf  represents  an  “average”  of  those  values  of  f  that 
lie  on  any  one  of  the  Q  discrete  “lines”  of  all  integer  multiples  of  ('ipi[q\,'ip2[q\,  —  1). 
For  example,  when  Q  —  8  and  tpi,ip2  correspond  to  the  knight’s  moves  of  (17), 
(L*Lf )[m,n,p\  adds  up  the  values  of  f  over  the  surface  of  a  “cone”  consisting  of  8 
“lines”  radiating  outward  from  (■ m,n,p ). 

Having  that  L*L  is  a  filter,  namely,  the  convolutional  operator  L*Lf  =  h  *  f , 
we  can  now  better  understand  it  using  Fourier  transforms.  In  particular,  since  our 
model  is  now  discrete,  we  use  the  Discrete  Fourier  Transform  (DFT),  namely  the 
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operator  E*  :  £(Z3P)  — *  £(Zp),  defined  as 


(E-f)[a,/3,T]:=  V  V  ^  e-¥(»™+^P)f[ 

mGZp  nGZp  pEZp 

It  is  well-known  that  the  DFT  is  a  scalar  multiple  of  a  unitary  operator,  meaning 
its  inverse  is  a  scalar  multiple  of  its  adjoint.  In  particular,  (E*)-1  =  -^-E,  where 
E  :  i(7?p)  -7  £(Z 3P), 

(Ef)[m,n,p]=  V  V  ^e¥(«M[|a,A7|. 

o:GZp  /3EZp  'y EZp 

It  is  also  well-known  that  the  DFT  of  a  discrete  convolution  (21)  is  a  scalar  multiple 
of  the  pointwise  product  of  their  individual  DFTs: 

(E*^*^))^,/^]  =  (E  %)[a,j9,j](E%)[a,j9,j]. 

Applying  this  fact  to  L*Lf  =  h  *  f  gives 

(E*L*Lf  )[a,  /3, 7]  =  (E*h)[a,/3,7]  (E*f)  [a, (3, 7]. 

Letting  A  =  (E*h)  allows  us  to  express  the  DFT  of  LfiLf  as 

(E*L*Lf)  [a,  0, 7]  =  A[a,  /3, 7]  (E*f )  [a,  {3, 7] . 

That  is,  just  like  in  the  continuous  setting,  L*L  acts  as  a  pointwise  multiplica¬ 
tion  operator  in  the  Fourier  domain.  Here,  for  any  (a,/?,  7)  G  Z3,  ea>ig)7  G  f'(Zp), 
ea  /g  7[m,  n, p]  =  e^^Qm+/3n+7pl  is  an  eigenfunction  for  L*L  with  corresponding  eigen¬ 
value  A[a, /?,  7].  This  is  equivalent  to  us  having  L*L  =  EAE*  where  A  :  £(Zp)  — > 
£(Z p)  is  pointwise  multiplication  by  A.  To  better  understand  the  ramifications  of 


this  fact,  we  now  obtain  a  simplified  expression  for  A[a,/3,7]: 


A[a,j3,7]  =  (E*h)[a,  fi,y\ 

=  (E*£h,)[a,/3,7] 

=  ^(E’h,)[a,A7]. 

<? 

From  here  we  will  focus  on  calculating  Ag[a,/3, 7]  =  (E*h9)[a,  /3, 7]: 


Aq  [cr,  /3,  7]  [9], ^>2  [9],— 1)  I  |p>  Pi  t] 

=  EEE  e“¥(Qm+/3"+7P)  E  *r<*W>*W,-i)[m,  n,p]. 


m  n  p 


To  proceed,  we  interchange  sums: 


Aja,/3,7]  =  E  E  E  Ee  2”(am+^n+7p)<5rW,i [qU2[q]-1)[m,n,p\ 


r  m  n  p 


—  \  e-^(o:r^ik]+/5r^2[g]+7(-0) 


E' 

r 

£[■ 


(1  -  2p‘ (a,;'l  h/l  +  M+7(-l)) 


Using  the  Geometric  Sum  Formula,  we  end  up  with 


\[a,/3, 7' 


{P,  onlh[q\  +  /^[g]  -7  =  0  mod  P, 
0,  else, 


{1,  7  =  m/h[g]  +  /%[<?]  mod  P, 

0,  else. 


(25) 
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Note  that  the  multiplier  of  P  in  (25)  is  the  characteristic  function  of  the  plane  in  Zp 
which  is  “perpendicular”  to  (ifi[q],  —  1)-  We  now  summarize  the  main  results 

of  this  chapter  in  the  following  theorem: 

Theorem  1.  Let  L  :  £(ZP)  —>  £(Z p  x  Z q)  be  the  discrete  X-ray  transform: 

(Lf )[m,n,q\  :=  ^  i[m  -  pfji[q],n  -  pfj2[q},p)- 

pGZp 

The  adjoint  of  L  is  L*  :  £(Z2P  x  Z q)  — >  £(ZP), 

(L*g )[m,n,p\  =  ^  +  T0i [?],«+  P'02 [q],q] ■ 

Moreover,  L*L  :  £{Z3P)  — >  £(ZP  x  Z q),  is  a  filter,  with  L*Lf  =  h  *  f,  where 
h [m,n,p\  =  5’i^[qU2[qi-i)[m,n,p\. 

q&ZQ r£ZP 

As  such,  the  DFT  o/L*Lf  is  (E*L*Lf)[a:,  (3, 7]  =  A[a,  (3, 7](E*f)[a,  (3, 7],  where 
A[a,/3,7]  =  5^(E*hq)[o!,/3,7] 

<?6Zq 

_  p  ^  I  1,  7  =  aipi[q]  +  P&[q]  mod  P, 
q&^Q  I  0,  else. 

As  in  the  continuous  setting,  we  see  that  L*L  is  not  invertible.  Indeed,  the 
null  space  of  L*L  (which  equals  the  null  space  of  L)  consists  of  all  f  G  £(ZP)  whose 
DFT  is  completely  supported  on  the  set-complement  of  every  plane  of  the  form 
{(a,/3, 7)  G  Zp  :  7  =  aifi[q]  +  Pfaiq]  mod  P}.  Part  of  this  null  space  corresponds 
to  a  discrete  “cone  of  missing  information.”  For  example,  when  Q  =  4  and  iphifi 
are  given  by  (16),  the  null  space  of  L  is  the  set  of  all  f  whose  DFTs  are  identically 
zero  on  the  four  planes  in  Zp  consisting  of  the  points  of  the  form  (a,  f3,  a),  ( a ,  (3,  (3), 
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(a,  (3,  —a),  and  ( a ,  (3,  —f3).  The  four  planes  form  a  rough  “cone.”  Any  f  whose  DFT  is 
completely  supported  on  the  “interior”  of  this  cone,  namely  the  set  {(a,  /?,  7)  :  |y|  < 
min{|a|,  |/3|}},  lies  in  the  null  space  of  L.  All  such  f's  are  completely  annihilated  by 
L.  meaning  any  solution  to  L*Lf  =  L*g  is  only  unique  up  to  the  addition  of  such 
functions.  Because  of  this  non- uniqueness  we,  in  Chapter  V,  will  regularize  our  least 
squares  problem  with  penalty  terms  that  will  promote  those  solutions  to  the  normal 
equations  which  are  either  sparse  or  have  sparse  gradients. 

For  now  however,  we  turn  our  attention  to  a  more  in-depth  comparison  of  the 
above  discrete  model  and  the  continuous  model  from  Chapter  II.  We  find  that  despite 
being  fully  rigorous,  the  discrete  model  is  not  entirely  faithful  to  the  motivating  real- 
world  physics,  and  as  such,  take  steps  to  improve  it. 
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IV.  Approximating  continuous  chromotomography  with  a 

generalized  discrete  model 

In  Chapter  II  we  introduced  the  linear  operator  (2)  that  provides  a  continuous- 
variable  model  of  the  CT  camera  system.  In  this  model,  the  operator  L  integrates  a 
hyperspectral  image  /  over  every  line  of  the  form  {{x  —  z  cos  9,y  —  z  sin  9,z)  :  z  €  R}. 
The  a^-projection  of  an  example  of  such  a  line  is  depicted  in  Figure  2(a).  Here  9  is 
chosen  so  that  this  line  has  slope  that  is,  so  that  tan  9  —  Every  point  on  the  line 
seen  in  Figure  2(a)  contributes  to  our  measurement  ( Lf)(x,y,9 ).  In  Chapter  III, 
we  created  a  discrete  version  (18)  of  the  X-ray  transform.  This  operator  behaves 
similarly  to  the  continuous  version  in  that  it  sums  the  values  of  a  data  cube  f  over 
discrete  “lines,”  namely  over  all  points  of  the  form  {(m  —  p^i[q\,n  —  p'02 [<l\ , P)  ■  V  £ 
Zp}.  Figure  2(b)  depicts  these  points  for  a  discrete  “line”  that  has  the  same  slope 
as  the  continuous  line  of  Figure  2(a),  namely  one  in  which  ij)i[q]  =  2,  ^2 [<?]  =  1- 

Note  there  are  two  main  differences  between  the  continuous  version  seen  in 
Figure  2(a)  and  the  discrete  version  depicted  in  Figure  2(b).  The  first  difference  is 
that  due  to  the  periodicity  in  the  discrete  model,  the  discrete  line  “wraps  around”  the 
edges  of  our  grid.  We  do  not  address  this  issue  here.  The  second  difference  is  that  the 
discrete  operator  only  takes  into  account  the  integer  multiples  of  ('ipi[q\,  1), 

and  therefore  “skips  over”  the  pixels  that  lie  between  p(i/)i[q],  ^[q],  ~  1)  and  (p  + 
l)('ipi[q\,'ip2[q],  —1)-  This  chapter  attempts  to  improve  our  discrete  model  by  taking 
these  “skipped”  pixels  into  account. 

In  particular,  rather  than  attempt  to  fully  reconcile  the  discrete  operator  L 
with  the  continuous  one,  L,  we  instead  try  to  relate  L  to  a  periodic  analog  of  L.  To 
be  precise,  let  T p  R/(P Z)  be  the  real  numbers  modulo  P.  Recall  the  operator 
L  is  applied  to  functions  /  :  R3  — y  R.  We  now  generalize  it  to  another  operator  Lp 
that  is  applied  to  the  P-periodic  functions  /  :  T3P  — >  R.  To  do  this,  recall  that  L 
integrates  /  over  lines  of  the  form  {(re  —  zcos9,y  —  z sin 9,z)  :  z  G  R}.  In  order  for 
the  image  of  these  lines  in  Tf>  to  have  finite  length,  it  turns  out  that  we  must  only 
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(a)  (b) 


Figure  2:  (a)  In  Chapter  II,  we  investigate  the  continuous  X-ray  transform  (2)  which,  of  all  the  models  presented 
here,  provides  the  most  accurate  representation  of  the  underlying  real  world  physics  of  the  CT  camera 
system.  This  operator  L  integrates  over  lines  of  the  form  (x  —  z  cos  0,y  —  sin  0,z).  In  this  figure  we 
depict  the  ^-projection  of  one  of  these  three-dimensional  lines,  namely  the  one  in  which  tan#  =  | 
and  (x,y)  =  (0,0).  In  Chapter  II,  we  discuss  how  L*L  is  a  filter  and  can  thus  be  diagonalized  by  the 
continuous  Fourier  transform.  Examining  L*  L  in  the  Fourier  domain  allows  us  to  identify  the  “cone  of 
missing  information,”  that  is,  the  frequency  information  of  /  annihilated  by  L*L.  Unfortunately,  it  is 
difficult  to  make  the  theory  from  that  chapter  rigorous,  motivating  us  to  turn  to  the  discrete  theory  in 
Chapter  III.  (b)  In  Chapter  III  we  create  a  discrete  version  of  L,  that  is,  we  create  the  operator  L  that 
sums  over  discrete  “lines”  of  the  form  (m  —  pip i  [q],n  —  pip 2 [9] , p) .  In  this  figure  we  depict  a  discrete  version 
of  the  continuous  line  depicted  in  (a),  namely  one  in  which  (ipi[q],  ip2[q])  =  (2, 1).  In  Chapter  III,  we  show 
that  L*L  convolves  f  with  a  sum  of  characteristic  functions  of  lines  of  this  form  and  can  be  understood 
by  applying  the  DFT  to  it.  In  a  similar  manner  to  Chapter  II,  investigating  L*L  in  the  Fourier  domain, 
we  see  that  L*L  eliminates  the  frequency  components  of  f  whose  DFT  is  completely  supported  on  the 
complement  of  every  plane  of  the  form  7  =  aipi  [q]  +  (3ip2  [q]  mod  P.  This  roughly  corresponds  to  a  discrete 
“cone  of  missing  information.”  In  this  chapter,  we  focus  on  generalizing  the  discrete  theory  of  Chapter  III 
so  as  to  make  it  better  approximate  the  continuous  theory  of  Chapter  II.  To  do  this,  we  modify  the  discrete 
“line”  in  (b)  to  make  it  look  more  like  the  continuous  line  (a). 


use  angles  6  for  which  tan  6  is  rational.  That  is,  we  take  Lp  to  integrate  over  lines 
of  the  form  {(x  —  zifi i  [q] ,  y  —  z)  :  z  G  Tp}  where,  for  any  angle  parameter  q, 

we  require  Vh  [?] ,  V>2  [<?]  £  Z.  That  is,  for  any  /  G  L2(T3P )  we  let 

(LPf)(x,y,q)  :=  J  f(x  -  z^q],  y  -  zip2[q],  z)  dz.  (26) 

TP 

For  example,  for  P  =  13  and  foiol)  —  (2, 1),  the  line  that  LP  integrates  over 

is  depicted  in  Figure  3(a).  To  see  how  a  continuous  integral  over  such  a  line  can 
be  related  to  a  discrete  sum  over  the  “line”  in  Figure  2(b),  it  is  helpful  to  overlay 
Figure  3(a)  with  a  grid  of  pixels;  see  Figure  3(b).  In  particular,  we  claim  that  if  /  is 
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(a)  (b)  (c) 


Figure  3:  (a)  To  reconcile  the  operators  L  and  L  alluded  to  in  Figure  2,  we  introduce  a  periodic  version  of  the 
continuous  X-ray  transform.  This  continuous,  periodic  operator  Lp,  formally  defined  in  (30),  integrates 
over  lines  on  a  torus,  which  forces  the  “wrap  around”  effect,  depicted  here.  While  this  model  is  not  as 
physically  realistic  as  L — space  and  color  are  not  cyclic — it  seems  more  realistic  than  the  discrete  periodic 
operator  L.  However,  this  model  is  not  ideal  because,  being  analog  rather  than  digital,  it  cannot  exploit 
modern  computational  hardware.  In  order  to  make  this  continuous-periodic  model  more  conducive  for 
such  computations,  we,  in  Chapter  IV,  take  steps  to  bridge  the  gap  between  it  and  our  non-realistic  but 
computational- friendly  discrete-periodic  operator  L.  (b)  Intuitively,  if  a  continuous,  P- periodic  function 
/  were  sampled  finely  enough,  we  should  be  able  to  discretize  its  X-ray  transform  Lpf  in  terms  of  finite 
sums.  Graphically,  we  can  think  of  this  process  as  overlaying  our  continuous  torus  with  an  integer  grid,  as 
depicted  here.  Each  square  represents  a  region  over  which  we  assume  our  continuous  function  /  assumes  a 
constant  value.  Identifying  this  continuous  function  with  its  integer  samples  f  we,  in  Theorem  2,  show  that 
the  continuous-periodic  X-ray  transform  of  /  indeed  equals  a  weighted  version  of  the  discrete-periodic  X-ray 
transform  of  f.  (c)  In  order  to  relate  Lpf  to  the  discrete  theory  of  Chapter  III,  it  turns  out  it  is  necessary 
to  generalize  our  discrete  periodic  X-ray  transform  L  to  another  operator  K  that  contains  additional 
weight  parameters  w[a,  &].  As  explained  in  Theorem  2,  these  weights  are  related  to  the  proportion  of  our 
continuous  line  that  lives  in  any  given  square.  In  essence,  we  form  a  discretized  version  of  (a)  by  overlaying 
it  with  a  grid  (b).  This  results  in  a  weight  function  (c).  That  is,  when  compared  to  a  sum  over  the 
points  depicted  in  Figure  2(b),  summing  over  the  weights  in  (c)  does  a  better  job  of  approximating  the 
integral  over  the  line  depicted  in  (a).  After  Theorem  2,  the  remainder  of  this  chapter  is  devoted  to  showing 
that  K  shares  many  of  the  nice  computational  properties  of  L.  That  is,  the  point  of  this  chapter  is  to 
show  how,  under  slight  modification,  the  discrete  perspective  of  Chapter  III  can  indeed  be  used  to  form  a 
computationally  efficient,  mathematically  rigorous,  and  physically  realistic  discretization  of  the  continuous 
theory  in  Chapter  II. 


sampled  at  a  high  enough  resolution  (i.e.,  if  the  grid  in  Figure  3(b)  is  fine  enough) 
the  integral  of  /  over  the  continuous  line  is  essentially  equal  to  a  weighted  sum  of  the 
integer  samples  of  /;  see  Figure  3(c).  Moreover,  the  values  of  these  weights  follow 
patterns  similar  to  our  discrete  “line”  in  Figure  2(b).  The  remainder  of  this  chapter 
is  devoted  to  making  these  notions  rigorous  and  then  exploring  their  consequences. 
As  before,  for  the  sake  of  readability  we  suppress  the  domains  of  the  indices  of 
summation;  sums  over  the  variables  m,n,p,a,b,c  range  over  Zp,  while  those  over 
the  variable  q  range  over  Z q. 

Specifically,  we  make  the  simplifying  assumption  that  our  continuous-variable 
periodic  function  /  :  T3P  — >  M  has  been  sampled  so  finely  that  it  is  effectively 
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piecewise  constant.  That  is,  letting  f  G  £(h p)  denote  the  integer  samples  of  /,  we 
assume  that  for  almost  every  ( x,y,z )  G  T p,  f(x,y,  z)  has  the  form 


f(x,y,z)  =  ^  fk6.c]1(0i6iC)+[_i)i]3(a;>S/>2;)- 


a,6,c£Zp 


(27) 


Here,  the  function  1,  ,  ,  r  ,  ,i3  :  Tp  — >•  M  is  the  characteristic  function  of  the  cube 

(a,6,c)+[—  Y 


(x,  y,  z)  G  T|>  :  a  -  -  <  x  <  a  +  -,  b-  -  <y  <b+  c--<z<c+-\. 


2  2 


2  2 


Replacing  (a,  b,  c )  with  (a',  6',  c')  and  substituting  this  expression  for  /  into  our  con¬ 
tinuous  periodic  operator  (26)  gives 

(Lpf)(x,  y,  q)  =  /  ^  £[a',b',c!]l{al  bl  cl)+^_^z{x  -  #i[g],p  ^  zfaiq],  z)  dz. 

rrn  a'b'd 

lip 

To  proceed,  we  decompose  the  integral  over  Tp  =  [—  \,P  —  |)  into  a  sum  of  P 
integrals,  each  over  an  interval  of  the  form  \p  —  |,p  +  |)  for  some  p  =  0, ...,  P  —  1: 


(■ Lpf){x,y,q )  = 


E  /  E 

P  i  a',b',d 
P~  2 


f  [a? ,  6' ,  c']  1  (a,  c,} + 1-_  i  i  j  3  (x  -  2^1  [?],y-  ^2  [g],^)  dz. 


Distributing  this  integral  over  finite  sums  and  scalar  multiples  gives 


p+ 


(LPf)(x,y,q)  =  J2  {[a'rb',c ']  J  l(a,6,c/)+[_  1^3(2;  -^i[g],p-^2[g],-)d^. 

p  a' ,b' ,c' 


P~r 


At  this  point,  we  further  assume  that  (x,y)  lies  on  the  integer  grid.  That  is,  we 
assume  that  (x,  y)  =  ( m ,  n)  for  some  m,  n  G  Z.  Thus,  for  any  p,  we  can  substitute 
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(x  —  p'lpi  [q]  —  a,  y  —  p-0'2 [q]  —  b,p  —  c)  for  (a',  b',  c'),  yielding 


(■ LPf)(x,y,q ) 

=  EEfi*  -p*l>i[q]  -  a,y-p^M  -  b,p  -  c] 

p  a,b,c 

p+l 

*  J  -  z,pM'v " **»M’*> ** 

p-i 

=  EE  wja,6,  c]i[x  -pipi[q\  -  a,y-pipi[q]  -  b.  p  -  c], 

p  a,b,c 

where  wq  G  £(h p)  is  the  (/-dependent  weight  function 


p+h 


W«M’C1  :=  J  d^r. 


p~\ 


We  now  simplify  our  expression  for  wq.  First,  by  writing  out  the  formula  for  the 
characteristic  function  and  simplifying,  we  observe  that 


1  iis ((z  -  p)rl>i[q\,  (z -  p)^2[q\,p-  z)  d z. 

2  ’  2  J 


Next,  for  any  fixed  p  we  substitute  t  =  z  —  p  to  obtain 


1 

2 

w q[a,b,c}  =  j  l(a)M+[_i  i]3(^iM,#2[g],-^)dt.  (28) 

_  1 
2 

Here,  as  t  ranges  from  —  |  to  (h0i  [g] ,  tfo  [<?] ,  —t)  traces  out  a  segment  of  the  line 
that,  for  a  given  q.  passes  through  (0,0,0)  and  (i/)i[q],  ^[q],  ~ !)•  This  line  corre¬ 
sponds  to  the  same  discrete  “line”  that  we  sum  over  in  our  discrete  CT  model  (18). 
Examining  (28),  we  further  note  that  since  —  |  <  t  <  |,  in  order  for  w q(a,b,c)  to 
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be  non-zero,  we  must  have  that  c  =  0.  Therefore,  we  can  write  (28)  as  a  function  of 
only  two  variables: 

1 

2 

w q[a,b]  =  I  l(ai6)+^_ijij2(t^1[g],^2[g])df.  (29) 

_  1 
2 

We  now  summarize  these  results  in  the  following  theorem: 

Theorem  2.  Making  the  assumption  that  the  hyperspectral  data  cube  f(x,y,z )  is  a 
piecewise  constant  (27)  ,  the  continuous  periodic  X-ray  transform  (26)  of  f  :  Tp  — y  M 
at  any  ( x ,  y)  =  (m,  n )  G  Z2  is  a  sum  of  the  form 

( LPf)(m ,  n,  q)  =  ^  w^a’  “  P'lP\[q]  -  a,  n  -  pf)2[q]  -  b,p\  (30) 

pEZp  a,£>EZp 

where  w q[a,b]  is  given  by  (29). 

Inspired  by  this  result,  given  any  if  i,ip2  G  ^(Zq,  Z)  and  weight  functions 
{wq}q&q  —  £(^p)>  consider  the  operator  K  :  t(fL p)  — *  £(Zp  x  Zq), 

(Kf)[m,n,g]  =  ^  ^  wja,  6]f[m  -  pifi[q}  -  a,  n  -  p^2[q\  -  b,p\.  (31) 

pEZp  a,frEZp 

Theorem  2  tells  us  that  the  continuous  X-ray  transform  of  /  G  L2(Tp)  can  be  well- 
approximated  by  sums  of  the  form  Kf  where  f  G  £(h p),  provided  the  weights  are 
taken  according  to  (29).  Moreover,  note  that  our  formula  (31)  for  K  is  only  a  slight 
generalization  of  the  formula  (18)  of  our  purely  discrete  X-ray  transform  L.  As  such, 
it  is  reasonable  to  hope  that  the  theory  that  we  developed  in  Chapter  III  will  carry 
over  to  this  setting.  We  now  show  that  this  is  indeed  the  case. 

In  the  derivations  that  follow,  it  is  helpful  to  regard  the  two-dimensional  func¬ 
tion  wg  G  £{fL p)  as  a  three-dimensional  function  wq  G  f'(Zp)  where  w q[a,b,c]  := 
wjo,  6]<5o[c].  Note  this  implies  w q[a,b\  =  ^cWg[a,  b,  c],  at  which  point  our  formula 
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for  K  becomes 


(Kf )[m,n,  q]  =  EE  w q[a,  b,  c]f[m  -  pipi[q]  -a,n-  p^2[q\  -b,p-c], 

p  a,b,c 

As  in  Chapter  III,  it  is  not  hard  to  show  that  this  K  is  a  well-defined  bounded 
linear  operator  from  £( Zp)  into  £(Z2P  x  Zq).  Moreover,  our  method  for  computing 
L*  generalizes  to  a  method  for  computing  K*: 

<f,K*g) 

—  (Kf,  g) 

=  Kf)[m,n,q])*g[m,n,q\ 

m,n  q 

=  EEE  (wq[a,b,c]f[m  —  pipi[q]  -  a,  n  —  pip2[q]  ~  b,p  -  c])*g[m,  n,  q]. 

a,b,c  q  m,n,p 

Continuing,  for  any  q  G  Zq  and  a,  b,  c  G  Zp,  we  make  the  substitution  (m' ,n' ,p')  = 
(m  —  p'ljji  [q\  —  a,  n  —  p'i/j \  [q\  —  b,p  —  c).  Our  expression  for  (f,  K*g)  then  becomes 

YY1  (f  [m',ri,i/])*(wq[a,b,c])* 

a,b,c  q  m' ,nf  ,p' 

x  g [m!  +  (p1  +  c)ipi[q]  +  a,  n'  +  (p  +  c)ip2[q]  +  b,  q] 

=  Y  (f [m',n',p'])* 

m'  ,n'  ,p' 

x  Y  Y(w^a’  b’  c])*g[m'/  +  ip'  +  c)^i[<?]  +  a>  ri  +  (p  +  c)^2[g]  +  b,  q]. 

q  a,b,c 

At  this  point,  we  replace  (a,  6,  c)  with  its  negative,  and  write  (wj— a,  —  6,  —  c])*  as 
w,[a,  b ,  c];  in  the  signal  processing  literature,  the  conjugate-reversal  wq  of  w  is  known 
as  the  involution  of  w.  This  yields  the  following  expression  for  (f.K*g): 

Y  (fK> nVD*  c^m' +  (p'  ~  _  a’ n>  +  (p7  “  c)^[g]  -  b,  q\. 

m'  ,n'  ,p'  q  a,b,c 
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Thus,  the  adjoint  of  K  is  necessarily 


(K*g  )[m,n,p\  =  yywq[a,b,c]g[m  +  (p  -  c)ipi[q\  -a,n+  ip  -  c)i)2[q\  -b,q ]. 

q  a,b,c 

Since  w?[a,  b ,  c]  =  w q[a,  6]<50(c)  and  w q[a,  b ,  c]  =  w q[a,  6]<50(c)  this  becomes 

(K*g)[m, n,p]  =  yt  y  wg[a,6]g[m+pV>i[g]  -  a,n  +  ptp2[q]  ~b,q].  (32) 

qE%Q  a,bEZp 

Note  that  this  expression  for  K*  is  a  generalization  of  that  for  L*.  Indeed,  it  becomes 
L*  provided  we  take  wq  =  <5(0,o).  Similarly,  our  expression  (20)  for  L*L  generalizes 
to  one  for  K*K: 

(K*Kf)[m,  n,p] 

=  ££  W q[a,  b,  c]  (Kf)  [m  +  (p  -  c)^i  [q]  -a,n  +  (p-  c)^2[q]  ~  b,  q] 

q  a,b,c 

=  yy  y^q[a',b',c]wq[a,b,c} 
q  a' ,b' ,c'  a,b,c 

x  f  [m  +  (p  —  p'  —  c)^> i  [q]  —  a  —  a' ,n  +  (p  —  p'  —  c)^2  [<?]  —  b  —  b\  p'  —  c'] . 

p' 

Making  the  substitution  r  =  p'  —  p  +  c,  our  expression  now  becomes 
(KdKf  )[m,n,p\  =  y  y  y  w,[o',  bf,  c']wq[a,  b,  c] 

q  a1  ,b'  ,c'  a,b,c 

x  y^  f  [m  —  rijj i [q]  —  a  —  a' ,  n  —  rfalol  —  b  —  b' , p  +  r  —  c  —  c'). 

r 

Recalling  that  L*L  could  be  expressed  as  a  convolution,  it  is  not  surprising  that 
K*K  follows  a  similar  pattern.  To  see  this,  recall  the  functions  {h q}q^zQ  defined 
in  (22)  which  correspond  to  “characteristic  functions”  of  discrete  “lines.”  The  last 
summations  in  our  above  expression  for  K*K  can  be  written  in  terms  of  convolutions 
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of  f  with  these  hg’s: 

(K*Kf  )[m,n,p\ 

=  £  wq[a',  b\  c']wq[a,  b,c](hq*i)[m~a-a’,n^b-b',p-c- c'] 

q  a'  ,b'  ,c'  a,b,c 

=  W9[a,  b,  c ]  [w,  *  (hg  *  f)]  [m  -  a,  n  —  6,p  —  c] 

9  a,6,c 

=  X]  (w,  *  [Wg  *  (hg  *  f)])  [■ m,n,p\ . 

<? 

Since  convolutions  are  associative,  commutative,  and  distributive,  we  can  rewrite 
this  expression  for  K*K  as 

(K*Kf)  =  I  ^  [(Wg  *  Wg)  *  hg]  I  *  f  .  (33) 

\<?6  ) 

That  is,  K*K  is  the  act  of  filtering  by  ^  (w  *  w9)  *  hg.  As  a  quick  check  on  our 
derivation,  note  that  (33)  becomes  (23)  when  we  take  wq  =  wq  =  <5(0,o,o)  for  all  (l- 

Since  K*K  can  be  expressed  as  a  convolution,  in  order  to  understand  it  better, 
we  once  again  turn  to  the  DFT.  Recall  from  (25)  that 


Ag[a,/3,7]  =  (E*hg)[a;,  fi,  j 


{1,  7  =  ai/>i [q]  +  /3il>2[q\  mod  P, 

0,  else. 


(34) 


Next,  it  is  well-known  that  the  DFT  of  w  is  the  complex  conjugate  of  the  DFT  of 
w.  Indeed,  replacing  (a,  b,  c )  with  its  negative  gives 


(E*Wg)[a,/3,7] 


27ri 

P 


(aa’  +/3b'  -pyc') 


W, 


a,  6,  c 


* 


{(E*Wg)[a,  /3, 7]}*. 
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Now  recall  that  these  wg’s  are  intended  to  be  the  three-dimensional  extensions  of 
those  two-dimensional  weight  functions  defined  in  (29).  That  is,  given  wg  e  £(Zp), 
we  regard  it  as  wq  e  £(Z p)  by  letting  w q[a,b,c]  :=  wja,  6]<50[c].  The  three- 
dimensional  DFTs  of  such  functions  are  equal  to  the  two-dimensional  DFTs  of  their 
restriction  to  Z2P\ 


(E*w,)[tt,/?,7]  = 

a,6,c 

=  ^e-^(aa+/3b+7c)wja,6]50[c] 

a,6,c 

=  ^e-^(aa+/3b)wja,&] 

a,b 

=  (E*w 

As  such,  we  actually  have 

(E*K*Kf)[o;, p, 7]  =  ^  |(E*w,)[a,/3]|2A9[a,/3,7](E*f)[a,/3,7].  (35) 

qezQ 

We  summarize  (31),  (32),  (34),  and  (35)  in  the  following  theorem: 

Theorem  3.  Given  £  £(Zq,Z)  and  {w q}qezQ  Q  -£(Z 2P),  consider  the  corre¬ 
sponding  weighted  discrete  X-ray  transform  K  :  £(Zp)  £(Z2P  x  Z Q), 

(Kf)[m,  n,  q]  =  ^  ^  wja,  b]i[m  -  pif\[q\  -a,n-  p'02[(/]  -  b,p\. 

pEZp  a,6EZp 

T/ie  adjoint  of  K  is  K*  :  £(Zp  x  Zq)  — *  £(Zp)f 

[K*g)[m,n,p]  =  ^2  5Z  (w,[a,&])*g[m  +  p^i[g]  +  a,n  +  p/>2[<?]  +  M]- 
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Moreover,  K*K  :  £(Z3P)  — *  £(ZP  x  TLq)  is  a  filter.  Specifically ,  for  any  f  G  f'(Z'p) 


(E*K*Kf )[«,/?,  7]  =  «[«,  /3,7](E*f)[a,  £,7], 


where  the  eigenvalue  function  k,  e  £(ZP)  is 


«[<*,/?,  7]  =  pY^  ke*w9)[«^]i2 

I JSZq 


7  =  q^i  fe]  + 

else. 


mod  P, 


To  see  how  this  result  helps  us  to  better  understand  K*K.  we  now  examine  n 
in  the  example  where  ifi  and  fi>2  correspond  to  the  “knight’s  moves”  of  (17),  depicted 
in  Figure  1.  In  particular,  for  q  =  0,  we  have  ?/y[0]  =  2  and  fi>2  [0]  =  1.  In  this  case, 
our  expression  (29)  for  our  zeroth  weight  function  is 

1 

2 

w0[a,b\  =  fl{ab)+[_h^(2t,t)dt 

_  1 
2 


Note  w0[a,  b]  =  0  unless  6  =  0.  Moreover,  for  b  =  0  it  turns  out  that  w0[a,  b]  0 
only  when  a  =  —1,  0, 1.  To  be  precise,  in  this  case,  it  is  easy  to  show  that 


w0 [a,  6] 


1,  (a,  b)  —  (0,  0), 

|,  (a,6)  =  (l,0), 

< 

l,  (a,6)  =  (- 1,0), 

0,  else. 
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Meanwhile  when  q  =  1,  we  have  ^q[l]  =  1,  "02 [1]  =  2,  and  so 


wi[a,  b] 


1  iia(t,2i)dt. 

2  ’  2  J 


The  calculation  of  Wj  yields  a  result  similar  to  the  one  for  w0: 


Wi[o,  b] 


h  (a,6)  =  (0,0), 

b  («>&)  =  (0, 1), 

< 

b  (a,b)  =  (  0,-1), 


0,  else. 


Further  calculations  shows  that  w3,  w4  and  w7  are  all  equal  to  w0,  while  w2,  w5 
and  Wg  equal  W],  reflecting  the  inherent  symmetries  of  the  “knight’s  moves”  set.  In 
order  to  understand  the  ramifications  of  Theorem  3  in  this  setting,  we  must  compute 
the  DFTs  of  the  w9’s.  In  particular,  the  DFT  of  w0  is 


(E*wo)[a,0]  =  ^e-^(aa+/36)woM 

a,b 

If)  1  2-7ri  1  _  271 

=  2e  +4e  T  +4e  ' 

-  I  +  7,‘-,j:=(V) 

=  i(l  +  cos  (?*?)) 

= cos2  (¥)  ■ 


In  a  similar  manner,  it  can  be  shown  that  the  DFT  of  wq  (which  equals  that  of 
w2,w5,  and  w6)  is  cos2  (7 t/3/P).  Thus,  for  the  “knight’s  moves”  example,  K*K  acts 
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as  a  pointwise  multiplication  operator  in  the  Fourier  domain,  multiplying  E*f  by  k 
which,  by  Theorem  3,  is  given  by 


[a,  £,7]  =  X!  l(E*w9)[a,/3]|2\[a,/3,7] 

1,  7  =  m/q[g]  +  fifoiq]  mod  P, 


=  P  cos4(^) 

96(0,3,4,7} 


Pcos4(f)  5] 

96(1,2,5,6} 


0,  else 

1,  7  =  m/q  [g]  +  /3^2[g]  mod  P, 
0,  else. 


We  now  compare  this  result — obtained  via  the  more  realistic  CT  model  introduced  in 
this  chapter — against  that  given  in  Chapter  III.  There,  the  unweighted  discrete  op¬ 
erator  L*L  also  acts  a  pointwise  multiplication  operator  in  the  Fourier  domain.  But 
whereas  E*K*Kf  =  f«E*f  where  k  is  given  above,  L*L  instead  satishes  E*L*Lf  = 
AE*f,  where  A  is  an  unweighted  version  of  k,  namely 


A  [a,  /?, 7' 


/ 


V 


7  =  m/q[g]  +  ft  ip2[q] 


else. 


mod  P, 


That  is,  the  main  difference  between  the  two  approaches  is  the  presence  of  the 
cos4(^)  and  cos4(^)  weights  in  the  more  realistic  model — more  realistic  in  the 
sense  that  K  more  closely  imitates  the  continuous  X-ray  transform  than  L  does. 
These  weights  in  frequency  space  are  nonnegative,  no  more  than  1  in  magnitude, 
and  decay  to  0  at  the  edges  of  our  [—  |vf-]2  grid.  This  decay  means  that  K*K  does 
a  worse  job  of  preserving  f’s  high  frequency  content  than  L*L  does.  This  makes 
sense:  by  design,  K*K  does  a  better  job  modeling  the  continuous  “blur”  induced 
by  our  prism.  Such  blurring  will,  in  reality,  destroy  high  spatial  frequency  content. 
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We  thus  see  that  we  pay  a  price  for  making  our  discrete  model  more  faithful  to  the 
continuous  one:  given  L*Lf,  we  can  recover  all  parts  of  the  DFT  of  f  that  lie  on 
the  planes  perpendicular  to  {(tpi[q],  ~ ^)}q£ZQ]  given  K*Kf,  the  information 

contained  in  the  edges  of  these  planes  is  unreliable. 

We  emphasize  that  the  above  observations  regarding  L*L  versus  K*K  were 
made  in  the  special  case  where  {('ipi[q\,  ^2[(l\)q&Q  corresponded  to  the  knight’s  moves 
of  (17).  We  do  not  know  whether  this  same  behavior  occurs  in  general.  Nevertheless, 
there  are  some  things  we  can  say.  First,  both  the  operator  L*L  of  Theorem  1  and 
K*K  of  Theorem  3  are  filters,  and  the  only  difference  between  their  eigenvalue  func¬ 
tions  A  and  k  is  the  presence  of  additional  weight  terms  in  n.  More  can  be  said  when 
we  require  these  weights  |  (E*wg)  [a,  j3\  | J  to  arise  from  wg’s  of  the  form  (29),  such  as  is 
needed  for  Theorem  3  to  hold.  In  particular,  such  w9’s  necessarily  have  nonnegative 
values.  Moreover,  like  we  saw  in  the  above  example,  these  values  necessarily  sum  to 
1: 


2 

W?M  =  51  / 


a,b^Zp 


a,6EZp 


/(e  v 

,J  V, 


I  xa,feEZp 


b)+[-5>5]‘ 


2 

1 

2 

=  r  i, 


-5-P+3 


'■(t^i[q],t^2[q\)  dt 


—  Id  t 


2 

=  1. 


45 


This  forces  the  corresponding  weights  in  the  frequency  domain  to  be  no  more  than  1 
for  every  (a,  (3)  G  Z|>: 

|(E*w,)[a,/3]|2  = 

< 


In  order  to  determine  the  rate  of  decay  of  |(E*wg)[a, /3]|2,  we  would  probably  need 
to  first  find  explicit  expressions  for  the  DFTs  of  weight  functions  that  have  the  form 
of  (29).  Our  preliminary  investigation  of  this  quantity  leads  us  to  believe  that  such 
a  computation  is  nontrivial.  As  such,  we  leave  it  for  future  work. 

Regardless  of  whether  we  use  L*L  or  K*K.  we  are  still  faced  with  solving  a 
linear  system  with  a  large  null  space.  Because  of  this,  we  now  regularize  our  least 
squares  problem  with  penalty  terms  that  will  promote  those  solutions  to  the  normal 
equations  which  are  either  sparse  or  have  sparse  gradients. 
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V.  Chromotomographic  reconstruction  via  the  Split 

Bregman  Method 

Throughout  this  thesis  our  efforts  have  been  focused  toward  creating  and  understand¬ 
ing  a  discrete  model  of  the  X-ray  transform.  From  a  computational  standpoint,  a 
discrete  model  is  necessary:  the  best  reconstruction  algorithms  are  discrete  in  nature, 
and  make  use  of  modern  computational  hardware.  In  the  following  discussion,  we 
present  the  ideas  in  terms  of  the  discrete  X-ray  transform  L  of  Chapter  III.  However, 
the  same  ideas  can  easily  be  applied  to  its  generalization  K  given  in  Chapter  IV. 
At  its  core,  the  CT  reconstruction  problem  involves  computing  the  f  that  solves  the 
equation  Lf  =  g  for  given  camera  measurements,  g.  However,  as  mentioned  in  the 
introduction,  if  our  measurements  are  noisy,  that  is  g  =  Lf0  +  e,  this  equation  may 
not  even  have  a  solution. 

This  issue  motivates  us  to  turn  to  the  method  of  linear  least  squares.  That  is, 
finding  the  f  that  makes  Lf  as  close  to  g  as  possible  by  solving  argminf  ||Lf  —  g|||- 
Since  L  is  a  bounded  linear  operator,  the  standard  Hilbert  space  theory  of  linear 
least  squares  tells  us  that  we  can  solve  this  minimization  problem  by  instead  solving 
the  normal  equations,  L*Lf  =  L*g.  To  do  this,  in  the  previous  chapters  we  have 
rigorously  derived  expressions  for  L*  and  L*L.  Further  investigation  gave  us  that 
L*L  is  a  type  of  operator  known  as  a  filter  that  is  best  understood  from  a  frequency- 
based  perspective.  In  particular,  after  examining  it  in  the  Fourier  domain,  we  find 
that  L*L  destroys  information;  see  Chapter  III.  Specifically,  L*L  has  a  nontrivial 
null  space  consisting  of  those  functions  whose  Fourier  transforms  are  supported  on 
the  set-complements  of  certain  discrete  planes. 

Due  to  the  size  of  L*L’s  null  space,  there  are  an  infinite  number  of  solutions 
to  the  normal  equations,  meaning  we  must  make  some  further  assumptions  about 
the  f's  we  wish  to  reconstruct.  The  intended  purpose  of  the  CT  camera  system 
is  to  monitor  transient  events  that  take  place  in  natural  scenes.  In  particular,  if 
the  transient  event  is  very  bright  relative  to  the  surrounding  background,  such  as 
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a  fireball,  it  is  reasonable  to  restrict  f  to  be  sparse ,  meaning  it  has  few  nonzero 
entries.  In  addition  to  being  a  natural  model  for  such  signals,  this  allows  us  to 
make  use  of  the  many  recent  developments  in  the  fields  of  sparse  signal  processing 
and  compressed  sensing.  To  be  precise,  of  the  infinite  number  of  solutions  f  to  the 
normal  equations,  we  would  like  to  choose  the  one  solution  whose  zero  “norm”  ||f  ||o 
is  as  small  as  possible.  Here  || f  || 0  denotes  the  number  of  nonzero  entries  of  f.  That 
is,  in  order  to  find  the  sparsest  solution  to  the  normal  equations,  we  want  to  solve 

argmin||f||0  subject  to  L*Lf  =  L*g. 

f 

However,  the  zero  “norm”  is  not  well-behaved:  it  is  not  a  norm,  nor  is  it  differentiable, 
nor  is  it  convex.  Instead  we  will  take  the  now-standard  approach  of  using  the  1-norm 
as  a  proxy  for  the  zero  “norm,”  and  attempt  to  solve 

argmin  ||f  || x  subject  to  I/Lf  =  L*g.  (36) 

f 

Here,  the  1-norm  of  f  G  f'(Zp)  is  defined  to  be 


mEZp  nEZp  pEZp 


It  turns  out  that  due  to  the  size  of  L’s  null  space,  finding  a  direct  solution  to  (36)  is 
still  too  computationally  difficult.  The  standard  way  of  getting  around  this  issue  is 
to  regularize  (36)  by  adding  a  term  that  penalizes  large  values  of  || Lf  —  g|||.  That 
is,  to  find  a  sparse  f  for  which  Lf  is  close  to  g,  we  want  to  solve  the  unconstrained 
optimization  problem 

argmin  ||f||i  +  ^||Lf-g||2,  (37) 

f  z 

where  p  is  some  experimentally  chosen  weight.  This  problem  is  itself  highly  nontriv¬ 
ial.  In  short,  if  our  task  was  to  simply  solve  argminf  || Lf  —  g|||  we  would  be  able 
to  find  (many)  solutions  using  standard  linear  least  squares.  On  the  other  hand, 
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if  L  was  the  identity,  it  is  known  that  we  can  solve  minf  ||f||i  +  ^||f  —  g |||  using  a 
technique  known  as  shrinkage,  or  soft-thresholding : 

f  =  shrink (g,  ±)  :=  sign(g)  0  max{|g|  -  J, 0}. 

Here  the  sign  and  max  functions  are  evaluated  coordinate-wise,  and  the  ©  denotes 
pointwise  vector  multiplication.  Unfortunately,  our  problem  does  not  fit  into  either 
of  these  “simple”  frameworks,  and  we  must  use  a  more  complicated  approach.  In 
short,  it  is  the  combination  of  minimizing  ||f||i  and  the  2-norm  of  a  function  of  f 
that  makes  finding  a  solution  to  (37)  so  difficult. 

To  make  this  hard  optimization  problem  more  tractable,  we  instead  attempt 
to  solve  a  modified  version  of  (37)  obtained  by  transforming  it  into  a  two- variable 
problem.  That  is,  we  attempt  to  solve 

argmin  ||d||i  +  1| Lf  —  g|||  subject  to  d  =  f. 

r  .1  2 

Of  course,  this  is  simply  a  restatement  of  (37)  and  is  truly  no  easier  to  directly  solve. 
Nevertheless,  this  two-variable  formulation  helps  us  think  about  this  problem  in  a 
new  way.  In  particular,  we  can  attempt  to  enforce  our  d  =  f  constraint  by  adding  a 
new  penalty  term,  once  again  yielding  an  unconstrained  optimization  problem: 

argmin  ||d||i  +  ^||Lf  -  g|||  +  ^||d  -  f|&  (38) 

f  ,d  *  * 

In  practice,  in  order  to  ensure  that  our  Lf  =  g  and  d  =  f  constraints  are  adequately 
enforced,  it  is  common  to  try  to  solve  (38)  for  larger  and  larger  values  of  /i  and  v  and 
use  the  “eye  test”  to  find  the  best  apparent  solution.  Notice  though,  that  as  //  and 
v  get  very  large,  the  1-norm  term  becomes  insignificant  and  we  are  essentially  again 
solving  an  ill-conditioned  problem.  Putting  aside  this  issue  for  the  moment,  note 
that  for  any  fixed  /j  and  v,  it  is  not  obvious  how  to  find  an  optimal  d  and  f.  One 
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approach  to  solving  (38)  involves  alternating  minimizations,  where  we  iteratively 
calculate  new  values  of  f  and  d  according  to: 

ffc+1  =  argmin  ^||Lf  -  g||l  +  ^-\\dk  -  f\& 
f  z  z 

dk+1  =  argmin  \\d\h  +  ^||d  -  ffe+1||'. 
d  z 

The  issue  with  this  approach  is  that  dfc+1  =  shrink(f/c+1, 1/v)  meaning  that  d  and  f 
will  never  be  equal. 

Fortunately,  the  Split  Bregman  method  of  [11]  gives  a  way  to  circumvent  both 
of  these  problems.  In  the  Split  Bregman  method  we,  like  above,  alternate  between 
hireling  the  optimal  d  for  a  given  f  and  finding  the  optimal  f  for  a  given  d.  How¬ 
ever,  unlike  the  above  algorithm,  the  Split  Bregman  method  introduces  extra  “noise” 
parameters  that,  when  taken  into  account,  allow  this  alternating  optimization  ap¬ 
proach  to  converge.  In  order  to  understand  the  Split  Bregman  method,  we  must  first 
understand  its  predecessor,  known  as  Bregman  Iteration. 

5.1  Bregman  Iteration 

Bregman  iteration  [5]  provides  an  alternative  method  for  minimizing  a  given 
convex  functional  E(vl)  according  to  a  constraint  of  the  form  H( u)  =  0.  Here,  H 
is  assumed  to  be  a  differentiable  convex  functional  whose  minimum  value  is  0.  In 
the  context  of  the  CT  reconstruction,  u  will  become  our  pair  of  variables  (d,  f),  and 
H( u)  and  E( u)  will  be  taken  to  be 

B(u)  =  B(d,f)  =  ||d||1  +  |||Lf-g||l, 

ff(u)  =  ff(d,f)  =  i||d-f||i.  (39) 

As  mentioned  in  the  previous  section,  the  traditional  method  for  enforcing  the  con¬ 
straint  H( u)  =  0  is  to  add  a  penalty  term  to  our  objective  function  if(u).  That  is 
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we  attempt  to  solve 


(40) 


arg  min  E(u)  +  vH(  u) 

U 

for  larger  and  larger  values  of  v.  This  however  can  lead  to  numerical  instabilities  for 
large  values  of  v.  As  a  remedy  to  this  problem,  Bregman  iteration  takes  a  different 
approach.  Specifically,  it  uses  the  concept  of  “Bregman  distance,”  which  is  the 
“error”  between  the  convex  functional  E  and  its  supporting  linear  approximation, 
namely  the  function 


de(u>  v)  =  £(u)  -  £(v)  -  (P,  u  -  v). 

Here,  p  is  a  subgradient  of  E  at  v,  namely  a  member  of  the  subdifferential  set 

d E(v)  :=  {p  G  [£(Z3P)}2  :  E{ u)  >  E(v)  +  (p,  u  -  v),  Vu}. 

To  be  clear,  Bregman  iteration  does  not  require  E  to  be  differentiable  but  rather, 
only  convex.  At  any  point  v  where  E  is  differentiable  the  only  p  G  d E  is  its  gradient 
V.E(v).  This  can  be  understood  by  considering  the  first-order  Taylor  approximation, 
or  linearization,  of  E  at  v:  E( u)  «  E(v)  +  (v£(v),u  —  v).  Because  E  is  convex, 
this  approximation  is  from  below,  that  is,  i?(u)  >  E(v)  +  ( VE(v),u  —  v)  for  all  u 
and  so  VE(y)  G  dE(v).  At  points  v  at  which  E  is  not  differentiable,  one  can  show 
that  there  nevertheless  exist  vectors  p  for  which  E( u)  >  E(v)  +  (p,  u  —  v)  for  all 
u.  As  stated  above,  any  such  vector  p  is  called  a  subgradient  of  E  at  v,  and  the 
(nonempty)  collection  of  all  such  subgradients  is  denoted  dE{y). 

In  Bregman  iteration,  rather  than  attempt  to  solve  (40)  over  and  over  again 
for  larger  and  larger  values  of  u,  we  instead  fix  u,  make  an  initial  guess  u°,  and 
iteratively  solve  a  version  of  (40)  where  our  objective  function  E( u)  is  replaced  with 
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the  Bregman  distance  from  a  given  nk : 


ufc+1  =  arg  ruin  (u,  ufc)  +  vH{ u). 

U 

Note  this  requires  us  to  hnd  an  explicit  subgradient  pfc  for  E  at  every  given  ufc. 
In  practice,  this  is  accomplished  by  choosing  u°  to  be  any  minimizer  of  the  un¬ 
constrained  functional  -E(u),  which  permits  us  to  take  our  initial  subgradient  to  be 
p°  =  0.  To  derive  a  formula  that  iteratively  updates  p,  remember  that  H  is  differen- 

k 

tiable,  and  since  ufc+1  minimizes  Oe  (u,  ufc)  +  vH( u),  subdifferential  theory  informs 
us  that 

0  G  d  D £  (u,  uk)  +  vH (u)  |u=ufc+i 
=  d  [E(u)  -  E(uk )  -  (pfc,u  -  uk)  +  vH( u)]  |u=ufc+i. 

Since  —  E(uk)  —  (pfc,  u  —  ufc)  +  uH( u)  is  differentiable,  this  simplifies  to 

0  e  dE(u)\u=uk+i  +  V  [-E(uk)  -  (pfc,  u  -  uk)  +  uH( u)]  |u=u>=+i 
=  SB(u)|„u,+,  -  p*  +  H)(uk+1). 

Since  pA+1  G  dE(uk+1)  this  tell  us  it  suffices  to  take 

p k+1  :=  pfc  -  vVH(uk+1). 

To  summarize,  traditional  Bregman  iteration  is  to  let  u°  be  a  minimum  of  E( u),  let 
p°  =  0,  and  to  then  repeatedly  apply  the  following  two-step  algorithm: 

arg  min  (u,  ufc)  +  uE[( u),  (41) 

U 

pfc  -  uVH(uk+1).  (42) 
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In  [11],  the  authors  focus  on  the  special  case  where  H( u)  is  assumed  to  have  the 
form  ||| Au  —  b|||.  Note  that  this  assumption  is  indeed  valid  in  our  setting  since  we 
define  H( u)  as  H( d,  f)  =  ||d  —  f  |||,  meaning  we  can  take 


r  i 

d 

0 

I  -/ 

,  u  = 

,  and  b  = 

L  J 

f 

0 

The  reason  they  make  this  assumption  in  [11]  is  that  it  simplifies  the  Bregman 
iteration  steps  given  in  (41)  and  (42).  Specifically,  since  V||Au  —  b |||  =  2Ar(Au  — 
b),  (41)  and  (42)  become 


ufc+1  =  arg  min  (u,ufc)  +  vH{ u) 

U 

=  argminTJ(u)  —  E(uk)  -  (pk,u-uk)  +  — ||^4u  —  b||| 

u  2 

=  arg  min  E(u)  -  (pfc,u  -  uk)  +  ^||Au  -  b|||,  (43) 

pA:+1  =  pk  -  uAT(Auk+1  -  b).  (44) 


Moreover,  in  [11]  they  use  the  fact  that  this  two-step  process  has  a  surprisingly 
simpler  formulation:  letting  u°  be  a  minimizer  of  E( u),  p°  =  0  and  b°  =  b,  it  turns 
out  that  the  repeated  application  of  (43)  and  (44)  is  equivalent  to  the  repeated 
application  of 


ufc+1  =  arg  min  E(u)  +  -||Au  -  bfc||l, 

u  2 

(45) 

bfc+i  =  bfc  +  b  _  Auk+i 

(46) 

Looking  at  (46)  we  can  see  the  true  inspiration  behind  Bregman  iteration:  for  a  fixed 
v  the  value  nk+1  given  by  (45)  will  not  exactly  satisfy  the  constraint  Au  =  b;  in  (46) 
we  add  the  “error”  b  —  Aufc+1  back  into  bfc,  thereby  forcing,  in  the  next  iteration, 
the  optimization  (45)  to  better  compensate  for  the  aspects  of  b  that  it  is  missing.  In 
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the  image  denoising  literature — where  Bregman  iteration  was  first  popularized — this 
step  is  known  as  “adding  the  noise  back  in.” 

We  now  apply  these  ideas  in  the  special  case  where  we  have 


B(u)  =  S(d,f) 
//( uj  =  H(d,i) 


l|d|li  +  fl|Lf-g||l. 


(47) 

(48) 


Note  that  here,  the  initialization  of  picking  u°  to  be  a  minimum  of  E( u)  and  b°  =  b 
corresponds  to  taking  d°  =  0,  f°  to  be  any  solution  to  L*Lf  =  L*g  and  b°  =  0. 


5.2  Split  Bregman  Iteration 

We  now  turn  to  the  recently  introduced,  state-of-the  art  numerical  optimization 
technique  known  as  the  Split  Bregman  method  in  order  to  solve 

argmin  ||d||i  +  ^||Lf  -  g\\22  +  ^||d  -  f\\\. 

d.f  Z  l 

Letting  d°  =  0,  f°  be  any  solution  to  L*Lf  =  L*g,  and  b°  =  0,  applying  the  Bregman 
iteration  steps  (45)  and  (46)  to  the  functionals  given  in  (47)  and  (48)  gives 

(d‘+1,  f‘+1)  =  argmin  ||d||,  +  £||Lf  -  g|||  +  b|d 

d.f  Z  l 

bfc+1  =  bk  +  b  -  (dfc+1  -  ffc+1). 


-  f  -  bfc|||,  (49) 

(50) 
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This  is  Bregman  iteration  in  this  particular  setting.  The  Split- Bregman  method 
involves  splitting  the  numerically  difficult  optimization  (49)  into  two  steps: 


ffc+1  =  argmin  ^||Lf  -  g|||  +  ^||dfc  -  f  -  bA||l,  (51) 

f  z  z 

d‘+1  =  Mg  min  ||d||,  +  h|d  -  f‘+1  -  b‘|&  (52) 

d  Z 

bfc+1  =  bfc  +  b- (dfc+1 -ffe+1).  (53) 

We  now  discuss  the  optimizations  (51)  and  (52)  in  greater  detail.  To  be  precise,  (51) 
can  be  solved  using  standard  linear  least  squares: 


fi:+1  =  argmin 

f 


\/h  L 


[f]- 


^(dk  -  bfc) 


This  is  equivalent  to  solving  the  corresponding  set  of  normal  equations: 


(//L*L  +  ul)ik+1  =  //L*g  +  ul{dk  -  bk). 

Namely,  we  let  ffc+1  =  (//L*L  +  z/I)_1[//L*g  +  u\(dk  —  bfc)].  Here,  /iL*L  +  ul  is  a 
filter  meaning  we  can  use  the  DFT  to  invert  it.  To  be  precise,  in  Chapter  111,  we 
showed  how  to  use  the  DFT  to  unitarily  diagonalize  L*L  in  terms  of  its  eigenvalues 
A[a,/3, 7].  The  eigenvalues  of  /jL*L  +  ul  are  given  by  the  function  //A[a, /?,  7]  +  u. 
The  second  step  (52)  of  our  Split  Bregman  iteration  can  be  solved  using  shrinkage: 

dfc+1  =  shrink(ffe+1  +  bfc,  i) 

=  sign(ffc+1  +  bk)  0  max{|ffc+1  +  bfc|  -  i,  0}. 

To  be  clear,  Split  Bregman  iteration  truly  calls  for  new  f’s  and  d’s  to  be  calcu¬ 
lated  until  the  difference  between  f  and  d  falls  within  a  pre-determined  threshold,  at 
which  point  a  new  b  is  calculated  and  the  process  is  repeated.  That  is,  one  is  truly 
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supposed  to  alternate  between  (51)  and  (52)  many  times  before  updating  b  in  (53). 
However,  the  authors  of  [11]  indicate  that  this  method  is  not  desirable.  They  instead 
advocate  calculating  f,  d,  and  b  consecutively,  claiming  that  for  many  applications, 
optimal  efficiency  is  obtained  with  only  one  iteration  of  the  “inner  loop.”  This  is 
because  the  advantage  gained  by  a  slight  refinement  of  f  and  d  pales  in  comparison 
against  the  benefit  of  updating  b.  We  now  summarize  this  algorithm: 

Split  Bregman  Algorithm  for  discrete  chromotomographic  reconstruction: 

While  \\{k  —  ffc_1||2  >  tolerance 

ffc+1  =  argmin  |||Lf  -  g|||  +  V- ||dfc  -  f  -  bfc||| 

dfc+1  =  argmin  ||d||i  +  -||d  -  ffc+1  -  bfc||| 

d  2 

bk+ 1  =  bk  +  b  -  (dfc+1  -  ffc+1) 

end 

5. 3  Numerical  Experimentation 

In  Chapter  IV  we  introduced  a  generalization  K  of  our  discrete  X-ray  transform 
L  that  was  more  faithful  to  the  underlying  real-world  physics  of  CT  cameras.  At 
the  end  of  that  chapter,  we  discussed  an  example  which  showed  that  the  price  of 
this  realism  is  a  larger  null  space.  In  short,  the  operator  K  is  not  as  well-behaved 
as  L.  and  so  we  expect  the  problem  of  reconstructing  f  from  Kf  to  be  harder  than 
that  of  reconstructing  it  from  Lf.  This  begs  two  questions.  First,  how  much  of  a 
price — from  the  standpoint  of  numerical  reconstruction  error — do  we  actually  pay  for 
using  a  more  realistic  model?  Second,  does  the  numerical  experimentation  support 
our  hope  that  real-world  CT  reconstruction  is  actually  possible? 

In  this  section,  we  provide  some  preliminary  numerical  experimentation  that 
answers  both  of  these  questions.  We  find  that  reconstructing  from  Kf  is  indeed 
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less  reliable  than  reconstructing  from  Lf.  However,  the  two  approaches  perform 
similarly,  meaning  that  the  theory  of  Chapter  IV  indeed  delivers  greater  realism  at 
little  cost.  More  importantly,  our  simulations  reveal  that  when  using  either  L  or  K. 
it  is  indeed  feasible  to  reconstruct  a  sufficiently  sparse  hyperspectral  image  f  from  a 
very  small  number  of  CT  camera  measurements. 

To  be  precise,  we  compare  L  and  K  by  synthesizing  a  large  number  of  sparse 
images  f  and  then  applying  the  Split  Bregman  reconstruction  algorithm  twice,  once 
for  L  and  once  for  K.  To  be  clear,  we  are  interested  in  reconstructing  f’s  that  are 
sparse,  that  is  images  with  very  few  bright  spots.  To  test  how  well  we  can  reconstruct 
f  after  taking  measurements  from  K  and  L,  we  create  examples  of  sparse  scenes, 
drawing  inspiration  from  astronomical  images.  In  this  setting  we  expect  a  nearly 
black  background  with  a  few  stars.  To  create  this  scene  we  pick  a  proportion  of  pixels 
that  correspond  to  stars  and  incrementally  increase  this  proportion  to  investigate  how 
L  and  K  operate  on  scenes  that  are  less  and  less  sparse.  Due  to  the  large  null  spaces 
of  both  L  and  K,  we  expect  both  to  perform  poorly  as  f  becomes  nonsparse.  For  each 
given  proportion  of  pixels,  we  randomly  assign  points  in  our  PxP  spatial  grid  where 
these  stars  will  appear,  and  assume  that  at  every  star  location,  10  percent  of  the  star’s 
spectrum  will  be  nonzero.  That  is,  these  stars  are  sparse  in  the  color  spectrum  as  well. 
To  be  clear,  in  our  simulated  data,  we  allow  our  fixed  proportion  of  star  locations 
to  overlap,  just  as  two  stars  may  align  in  our  line  of  sight.  After  simultaneously 
applying  the  Split  Bregman  algorithm  to  the  problem  of  reconstructing  f  from  Lf 
and  Kf,  we  plot  the  relative  error,  ||f  —  Frecon  1 1 2  / 1 1  f  II 2,  of  the  reconstructed  image 
from  the  original  as  a  function  of  the  proportion  of  stars  in  the  scene. 

Looking  at  Figure  4,  we  first  notice  that  as  our  hyperspectral  image  f  becomes 
less  and  less  sparse,  this  algorithm  does  a  poor  job  of  reconstructing  the  original 
image.  This  of  course  makes  sense  because  L  and  K  have  large  null  spaces,  and 
so  it  is  unreasonable  to  expect  to  find  f  unless  it  is  sparse.  Indeed,  the  entire 
motivation  behind  the  Split  Bregman  algorithm  is  the  assumption  of  sparsity.  The 
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graph  also  confirms,  as  we  previously  predicted,  that  we  can  better  reconstruct 
the  original  image  using  data  produced  by  L  as  opposed  to  the  data  produced  by 
the  more  realistic  operator  K.  Indeed,  as  detailed  in  Chapters  III  and  IV,  the 
frequency  components  of  f  that  lie  off  of  certain  planes  are  annihilated  by  both 
operators.  However,  K  also  diminishes  the  high  frequency  components  of  f  that 
lie  on  these  planes.  That  is  why,  when  looking  at  Figure  4,  the  more  realistic 
model  consistently  underperforms  the  less  realistic  one.  This  is  not  a  bad  thing: 
no  real-world  system  would  truly  perform  as  well  as  our  L  does;  meanwhile,  K 
may  indeed  do  a  good  job  modeling  the  underlying  physics,  and  so  its  performance 
curve  is  probably  a  better  indicator  of  when  real-world  CT  reconstruction  could  be 
accomplished  in  practice.  Regardless,  the  investigation  of  these  operators  is  a  step 
towards  better  understanding  the  CT  reconstruction  problem  in  all  settings.  We 
can  use  these  models  to  better  understand  the  limitations  of  current  systems  and  to 
suggest  improvements  to  future  systems. 
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Figure  4:  This  plot  shows  the  magnitudes  of  the  relative  errors  between  randomly  generated  64  x  64  x  64  simulated 
hyperspectral  images  and  their  reconstructions  obtained  via  chromotomographic  measurements.  Specifi¬ 
cally,  it  gives  the  magnitudes  of  these  relative  errors  as  a  function  of  the  proportion  of  stars  in  a  simulated 
astronomical  scene.  For  each  proportion,  30  such  images  were  randomly  generated.  Each  star  had  an 
independently- generated  10%-sparse  color  spectrum.  For  each  simulated  image  f ,  we  computed  the  corre¬ 
sponding  chromotomographic  camera  measurements  g  =  Lf  that  arise  from  our  discrete  X-ray  transform 
as  well  as  those  measurements  g  =  Kf  that  arise  from  its  more-realistic  cousin.  In  both  cases,  we  used 
8  discrete  prism  “angles”  { (^l  [<?] ,  [<?] ) }g— o-  These  corresponded  to  the  “knight’s  moves”  depicted  in 

Figure  1.  For  each  of  these  g’s,  we  then  used  the  Split  Bregman  algorithm  to  reconstruct  f  according  to 
additional  assumption  that  it  is  sparse.  This  is  only  computationally  feasible  due  to  the  Fourier-based 
diagonalizations  of  L*L  and  K*K  that  we  provided  in  Theorems  1  and  3,  respectively;  they  let  us  perform 
the  least-squares  step  of  the  Split  Bregman  algorithm  using  three-dimensional  Fast  Fourier  Transforms. 
We  draw  two  important  conclusions  from  this  graph.  First,  we  see  that  chromotomographic  reconstruction 
via  the  Split  Bregman  algorithm  is  indeed  possible,  and  can  be  very  successful  provided  the  underlying 
image  is  sufficiently  sparse.  Second,  we  have  numerical  confirmation  of  the  observation  we  made  at  the  end 
of  Chapter  4:  it  is  indeed  easier  to  reconstruct  f  from  its  idealized  discrete  X-ray  transform  measurements 
Lf  than  from  its  more  realistic  measurements  Kf.  Based  on  its  greater  faithfulness  to  the  underlying 
physics,  we  believe  the  higher  of  these  two  curves  will  be  a  better  indication  of  the  actual  performance  of 
a  real-world  chromotomographic  imager,  should  such  a  system  be  fielded. 
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